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Abstract 

Stochastic models of reaction networks are becoming increasingly important in Systems Biology. In 
these models, the dynamics is generally represented by a continuons-time Markov chain whose states 
denote the copy-numbers of the constituent species. The state-space on which this process resides is a 
subset of non-negative integer lattice and for many examples of interest, this state-space is countably 
infinite. This causes numerous problems in analyzing the Markov chain and understanding its long-term 
behavior. These problems are further confounded by the presence of conservation relations among species 
which constrain the dynamics in complicated ways. In this paper we provide a linear-algebraic procedure 
to disentangle these conservation relations and represent the state-space in a special decomposed form, 
based on the copy-number ranges of various species and dependencies among them. This decomposed 
form is advantageous for analyzing the stochastic model and for a large class of networks we demonstrate 
how this form can be used for finding all the closed communication classes for the Markov chain within the 
infinite state-space. Such communication classes support the extremal stationary distributions and hence 
our results provide important insights into the long-term behavior and stability properties of stochastic 
models of reaction networks. We discuss how the knowledge of these communication classes can be used 
in many ways such as speeding-up stochastic simulations of multiscale networks or in identifying the 
stationary distributions of complex-balanced networks. We illustrate our results with several examples 
of gene-expression networks from Systems Biology. 

Keywords: stochastic reaction networks; state-space analysis; communication classes; state-space irreducibil- 
ity; stationarity; ergodicity. Mathematical Subject Classification (2010): 60J22; 60J27; 60H35; 65C05. 


1 Introduction 

Many biological processes are described as reaction networks, where finitely many species interact with 
each other through some fixed reaction channels m 1 nn m m. Traditionally, reaction networks have 
been mathematically studied by expressing the dynamics as a set of ordinary differential equations (ODEs). 
However it is now well-known that these deterministic formulations become highly inaccurate when the 
copy-numbers of the reacting species are small. This is because the timing of reactions becomes random, 
introducing noise into the dynamics, which can significantly change the behavior of the system being modeled 
[ZlIHl. Such situations arise commonly in Systems Biology, since intracellular networks often involve species 
with low copy-numbers like gene-transcripts, signaling proteins, messenger RNAs, transcription factors etc. 
015]. The biochemical noise generated by the intermittency of reactions can be taken into account using 
stochastic models of reaction networks. A common approach is to represent the dynamics as a continuous¬ 
time Markov chain (CTMC) whose states denote the copy-numbers of the constituent species [10]. In recent 
years, such models have been extensively used for understanding the role of noise in various biological 
mechanisms mM- 

We now formally describe such a stochastic model for a reaction network. Throughout this paper K, IR+, 
Z, N and No denote the sets of all reals, nonnegative reals, integers, positive integers and nonnegative integers 
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respectively. Consider a network with d species Si,..., which interact through K reaction channels of 
the form 

d d 

^ ^ ^ik ^ ^ ^ Pik^i i ^ 1 ; ■ • ■ ; (^ ■^) 

where and pik belong to Mq and they denote the number of molecules of that are consumed and 
produced by reaction k. Under the classical well-mixed assumption [12] , the network’s state at any time is 
described by a vector x = (si,... ,Xd) G Ng of copy-number counts of all the species, i.e. Xi denotes the 
copy-number of species . The net change in the state due to reaction k is simply given by the stoichiometry 
vector (^k € whose i-th component is Qk = {pik — ^ik) € Z. The usual CTMC model for the reaction 
dynamics nni stipulates that the transition rate from state x to state (x-l-Cfc) is A/c(a;) for each k = 1,..., if, 
where Ai,..., \k '■ Ng ®-i- called the propensity functions for the network and they are assumed to 
satisfy: 

for any cc S Ng and fc = l,...,if if Xk{x) > 0 then (a:-I-Cfc) G Nq- (1-2) 

This condition ensures that the set Ng is a valid state-space for the CTMC, because it is closed under the 
reaction dynamics i.e. if the starting state is in Ng then the dynamics remains in Ng throughout its trajectory. 
Under mild conditions on the propensity functions (see Chapter 6 in m), one can ensure that the CTMC 


{X{t))t>o = (W(t), ■ ■ •, Xd{t))t>o 

with the above transition structure is well-defined for any initial state Xq G Ng. Define the probability that 
the reaction dynamics is at state j/ G Ng at time t by 

Pxo{t,y) ='^{X{t) =y). (1.3) 

Then the dynamics of Pxo {t, •) is given by the Chemical Master Equation (CME) [14] which has the following 
form: 


dpxo{t,y) 

dt 


K 


= X! ^Pxoit, y - (k)Xkiy - Cfc) - Pxo (C y)Afc(j/)), 


(1.4) 


k=l 


for each y G Ng which can be accessed by the CTMC. For most examples in Systems Biology, the number 
of accessible states is either infinite or very large, and hence solving the CME to obtain the probabilities 
PxoitjU) is nearly impossible. One generally estimates these values by simulating the CTMC (X(t))t>g using 
Monte Carlo methods such as Gillespie’s Stochastic Simulation Algorithm (SSA) [12j . Another popular 
approach for obtaining approximate solutions is the Finite State Projection (ESP) method which efficiently 
truncates the state-space to a small finite set and then solves the CME over this finite set [T^. Both SSA 
and ESP based approaches work well for smaller networks and over finite time-intervals, and hence they do 
not help in satisfactorily assessing the long-term behavior and stability properties of the stochastic model. 
For Markov chains over a finite state-space, one way to assess this long-term behavior is by computing the 


disjoint closed communication classes in the state-space (see Section 2.1), using matrix or graphical methods 
m- However there do not exist methods for systematically finding all the closed communication classes for 
stochastic reaction network models with infinite state-spaces. The main goal of this paper is to develop such 
a method that can provably find all such classes under biologically reasonable assumptions on the network. 
This has important implications regarding the long-term behavior of the stochastic model because these 
closed communication classes support the distinct stationary distributions, which are like attracting fixed- 
points for the CME (1.4) in the space of probability distributions. Moreover each closed communication 


class serves as an irreducible state-space for the underlying CTMC (see Section 2.1). For this reason, we will 


generally refer to a closed communication class as an irreducible state-space in this paper. 

Observe that for many networks, the state-space Ng is simply too large, in the sense that it contains 
several extraneous states that are never visited by the dynamics. This is mainly due to the conservation 
relations present in the network which impose constraints on the copy-number ranges of the involved species. 
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Hence our first task is to develop a systematic procedure to weed-out these extraneous states and obtain 
a smaller non-empty set £q which is also a valid state-space (i.e. ( 1 . 2 ) holds with Ng replaced with £q). 


Furthermore we would like the representation of £o to be explicit enough to ensure that the copy-number 
ranges of various species can be easily identified. Apart from enabling the search for irreducible state-spaces, 
which is the primary goal of this paper, this explicit state-space representation allows us to gain a better 
understanding of the network dynamics. We now explore these issues in a greater detail. 

Let us return to the CTMC {X{t))t>o starting at some initial state a;o S Ng. The ideal or the smallest 
state-space £ for this process would simply be the set of all states in Ng that the reaction dynamics has a 
positive probability of reaching in a finite time[^ i.e. 


£ = {y ■Pxoit.y) > 0 }, 


(1.5) 


for some f > 0. The set £ is non-empty (as Xq & £) and using Chapman-Kolmorogov inequalities (see [17]) 
one can easily check that £ is a valid state-space and so the CTMC {X{t))t>o resides in this set throughout 


its trajectory. However the set £ is difficult to characterize in cases where the CME (1.4) cannot be solved 
and so the probabilities Pxoit, •) are unknown. Hence we look for a bigger set £q which contains £ and is 
also a valid state-space. The standard choice is the stoichiometry compatibility class (see [ini) defined by 


£o = (xq -f Range(S')) n Ng, 


( 1 . 6 ) 


where S = Col(Ci,..., Ck) is the d x K stoichiometry matrix whose columns are the reaction stoichiometry 
vectors and Range(S') is the range or column space of S. To see the containment £ C £o note that if y G £ 
then y must be reachable from xg in finitely many transition steps in each of the directions ^i,..., . Let 

be the number of steps needed in direction k, then setting r = (ri,..., xk) & N^ we must have y = xq Sr 
which ensures that {y — xq) G Range)^) and hence y G £o- We next write £o in terms of conservation 
relations for the network, which are the nonzero vectors in the left nullspace of the stoichiometry matrix S 

C{S) = {jGR^ = (1.7) 


where 0 is the vector of all zeros in For any nonzero 7 = ( 71 ,... , 7 ^) G £(S), the relation j'^S = 0 ^ 
implies that for each reaction k = 1,... ,K, the stoichiometry vector is orthogonal to 7 , i.e. ( 7 , fk) = 0, 
where (•, •) denotes the standard inner product in Therefore the stochastic reaction dynamics {X{t))t>o 
will satisfy 


( 7 , ^(i)) = ( 7 , ^( 0 )) = ( 7 , xo) for all t > 0 , 


which means that the copy-numbers of species included in the support set 


( 1 . 8 ) 


supp( 7 ) := = 1 ,..., d : 7 , ^ 0 } 


(1.9) 


conserve the linear constraint specified by vector 7 . Suppose that for some integer n > 0 we have Rank(S') = 
(d — n) and so the dimension of the subspace C{S) is n. Let { 71 ,... , 7 n} be a basis for C{S) and define a 
d X n matrix F and a n x 1 vector c by 

F = Col( 7 i,... , 7 „) and c = V'^xq. (1-10) 


From (1.8) and the orthogonality of vector spaces L{S) and Range(S'), we can equivalently express the set 
£g as 


£0 = {a; G Ng : F^a; = c}. (l-H) 

Taking a cue from the terminology used in |18j . we call F as the conservation matrix, c as the conservation 
vector and the pair (r,c) as the conservation data for the network. If the subspace C{S) is trivial (i.e. 
u = 0) then £g = Ng, but in the other situation when this subspace is nontrivial (i.e. n > 0), the states in 

^The choice of this finite time is not important because for a CTMC if Pxo{t,y) > 0 for some t > 0 then the same holds for 
all t > 0 (see Theorem 3.2.1 in |17| 1 
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Nq outside Eq are not reachable by the dynamics because £q is a valid state-space for the CTMC {X{t))t>o 
and it contains the initial state X( 0 ) = Xq. 

The representations (1.6) or ( |1.11| ) for Eq are quite abstract, making it difficult to get a sense of the 
dynamics and particularly the copy-number ranges of all the species. We remedy this problem by developing 
a method that systematically screens the space of conservation relations C{S) and expresses Eq in a more 
explicit form. We briefly describe the steps needed for this purpose. We call a nonzero vector 7 G C{S) a semi¬ 
positive conservation relation if all its nonzero entries are positive. Up to independence, the only other type 
of conservation relations are the mixed-sign ones which have at least two nonzero entries with opposite signs. 


Semi-positive conservation relations ensure that the species in their support-sets (see (1.9)) are bounded in the 
sense that their copy-numbers have a bounded range. This is evident from (1.8) because if 7 is a semi-positive 
conservation relation then for any i G supp( 7 ), we have 7 i > 0 and since ( 7 , X(<)) = X]jGsupp( 7 ) 
must also have Xi{t) G [0, (7, a;o)/7i]. We scan the space of semi-positive conservation relations and identify 
all the bounded species along with their suitable copy-number ranges. It is possible that these semi-positive 
relations do not span the whole space of conservation relations C{S), and so we then look for any mixed-sign 
conservation relations among the remaining unbounded species. We show that such conservation relations 
force a certain subset of unbounded species (called restricted species) to mimic the dynamics of the remaining 
unbounded species (called free species) according to an appropriately constructed affine map. For example, 
the following ATP-hydrolysis reaction occurs in many living cells: 

ATP (adenosine triphosphate) -I- H 20 (water) ADP(adenosine diphosphate) -f P(phosphate). (1-12) 


Generally ATP and H 2 O molecules are present in high concentrations in the cytosol, and so the dynamics 
of low copy-number species Si = ADP and S 2 = P can be well approximated by the simple network 


S1 + S2 


(1.13) 


The stoichiometry matrix for this network is 


S = 


1 -1 

1 -1 


and the space C{S) of conservation relations has dimension 1. One can see that there are no semi-positive 
conservation relations and the only independent conservation relation is 7 = ( 1 ,- 1 ) which is mixed-sign. 
Moreover (1.8) forces Xi{t) = (j){X2{t)) for all t > 0, where </> is the affine map given by (j){x) = x -\- ( 7 , xq). 

In large networks there are several conservation relations and each species can participate in many such 
relations. To account for all the possibilities, we will employ standard linear-algebraic methods, such as 
basic matrix manipulations, solving linear-algebraic systems and Linear Programs (LPs) [19j . to classify 
each species as one of three types: /ree, bounded or restricted. Under fairly general conditions satisfied by 
most biological networks, we prove that by relabeling the species, we can express the state-space Eq in a 
special decomposed form 


EQ=EbX $, 

where E^ is a finite set in Nq*” and <i> is the graph of an affine function (f 
nonnegative integer orthant, i.e. 


(1.14) 


restricted to the 


$ = 


|(a;,(/)(a;)) : X S Nq-^ and (/)(x)eNQ''| 


(1.15) 


Here df,db and dr are nonnegative integers denoting the number of /ree, bounded and restricted species 
respectively. The finite set Eb GL Nq'’ contains the dynamics of bounded species, while the infinite set 
4) C Ng^"*"'^'’ serves as a state-space for the dynamics of both free and restricted species, with the latter being 
“locked” in a fixed affine relationship (given by function (/>) with the former. Notice that in comparison to 


both (1.6) or (1.11), the form (1.14) for state-space Eq is more explicit as it clearly expresses the copy-number 


ranges for each species as well as the relationships among them. This enables a better understanding of the 
dynamics which can be leveraged to improve the efficiency of existing analytical methods. For instance. 
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one can use the form (1.14) to design optimal state-space truncations for the Finite State Projection (FSP) 
method for solving CMEs TS] . One can also use this form as a guide for automatically separating the high- 
copy-number and the low-copy-number species for the method of conditional moments (MCM) approach 
m or for deriving suitable hybrid approximations of the dynamics m- We do not explore these ideas in 
this paper but instead focus on using this form for the analysis of the long-term behavior of the underlying 
Markov process. 

Suppose there exists a conservation relation in C{S) such that all its d components are strictly positive. 
Such a situation generally occurs when the network satisfies some form of global mass conservation relation 
and in this case all the species are bounded (i.e. di, = d, df = dr = 0), and so the state-space Sq = £b is finite. 
Using elementary matrix or graph-theoretic methods m we can easily find all the closed communication 
classes or the irreducible state-spaces (see Section 2.1) £i,... ,£q within So, where the dynamics eventually 
lies starting from any initial state xq- These classes are mutually disjoint and each Sg supports a unique 
stationary distribution for the stochastic reaction dynamics. Moreover these distributions tti, ..., ttq form 
the extremal points of the simplex S formed by all the stationary distributions of the network 


E = 


^ Q 

TT = aqTTg : 
^ q=l 


each ai,...,aQ>0 and 


Q 

= 1 


(1.16) 


Typically for reaction networks in Systems Biology, a conservation relation with all strictly positive entries 
does not exist. This is because such networks are abstract representations of the actual processes where 
many details, such as the dynamics of abundant species (like ATP molecules, enzymes etc.), are intentionally 
omitted to make the analysis more tractable and pertinent to a given problem. Consequently, unlike classical 
chemical kinetics, global mass conservation fails because mass is allowed to be created and destroyed. Hence 
it is important to consider this general case, where free and restricted species are present, and the state- 
space So must be necessarily infinite. In such a scenario the stationary distributions do not always exist, but 
their existence can be guaranteed by other methods, like those developed by Meyn and Tweedie [221 [23]. In 
particular. Theorem 4.5 in [23] shows that stationary distributions will exist if one can find a Foster-Lyapunov 
function V So ^ R+ satisfying V{x) ^ oo as ||x|| —)■ oo, such that the K+-valued process {V{X{t)))t>o 
experiences a negative drift outside some compact (finite) set C C So- In a recent paper [21] we develop a 
computational framework for constructing such Foster-Lyapunov functions for a large class of biochemical 
reaction networks which includes several well-known examples from Systems Biology. 

Consider a stochastic reaction network with an infinite state-space So for which an appropriate Foster- 
Lyapunov function exists. In such a setting, the issue of existence of stationary distributions is resolved but 
the structure of the simplex E of stationary distributions is still unknown. The extremal points tti ,..., ttq of 
this simplex are the (unique) stationary distributions supported on all the irreducible state-spaces Si,... ,Sq 
within the infinite state-space So- The primary goal of this paper is to develop a method that explicitly 


identifies all these irreducible state-spaces using the decomposed state-space form (1.14). For this purpose 
we adapt and generalize the ideas contained in |25| , with the main observation being that for most biological 
networks, the free species can be organized in the form of birth and death cascades, depending on the minimum 
number of reactions that the species requires to be created from nothing (denoted by 0) or get reduced to it. 
As our examples suggest, for many Systems Biology networks these cascades have a natural correspondence 
to gene expression or signaling stages in the dynamics (see Section]^. Combining this cascade construction 
along with matrix methods that are used in the finite state-space case, our method provably determines all 
the irreducible state-spaces for a large class of networks satisfying some biologically reasonable criteria. It 
is interesting to note that for many networks, simply knowing these irreducible state-spaces allows one to 
easily compute the corresponding stationary distributions (see [26] and Example [6.5[ ). Even if the stationary 
distributions are not computable, the knowledge of all the irreducible state-spaces has several important 
consequences. Eor example, if the initial Ai(0) of the stochastic reaction dynamics {X{t))t>o lies in the 
irreducible state-space Sg, then the ideal state-space S (se e (]I.5[ )) consisting of only the reachable states, 
coincides exactly with Sg- Then due to Theorem 1.10.2 in [17). for any bounded real-valued function / on 
Sg we have 


lim E(/(A(t))) = ^ f{y)TTg{y) 

v&£q 


(1.17) 
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and the following limit holds with probability 1 


and lim ^ f f{X{s))ds= ^ f{y)Trg(y). 


(1.18) 


Using (1.17) one can show that various statistical quantities (means, variances, covariances etc.) associated 
with the stochastic reaction dynamics converge to their steady state values as t —)■ oo (see |24jl. which 
is helpful in the design of controllers that can robustly steer the mean copy-number of some species to 


specific reference values m- Relation ( |1.18[ ) shows that the stationary distribution of the population can 
be computed by evaluating the proportion of time spent in various states by a single stochastic trajectory 
(X{t))t>o over a long period of time. Such an insight can help in leveraging experimental techniques such 
as Flow-Cytometry and Time-Lapse Mieroscopy in the study of isogenic cell populations (see [H]). One can 
also use (1.18) to speed-up the estimation of the stationary distribution using Monte Carlo simulations. 
Moreover the knowledge about the exact support of iTq (viz. £q) can be used to sample from this stationary 
distribution more efficiently. Commonly we find that for networks arising in Systems Biology there is only 
one irreducible state-space £i and hence the simplex E collapses to a unique stationary distribution tti (see 
Section]^. In such a scenario the underlying CTMC is ergodic, and relations (1.17) and ( |1.18 1 hold for any 
bounded real-valued function / on the full state-space Sq and for any initial condition X (0) € £o ■ Checking 
ergodicity of networks is important for many applications. For example, under ergodicity one can apply tools 
from Transition Path Theory |28j to study the topology of networks by analyzing the statistical properties 
of trajectories that flow between two subsets of the irreducible state-space. Another important application 
area where checking ergodicity is crucial, is for speeding-up the stochastic simulations of multiscale networks, 
which have reactions firing at multiple timescales |29j . Such networks are common in Systems Biology and 
it is known that their exact stochastic simulation, using Gillespie’s SSA for example, is highly cumbersome, 
because most of the simulation time is spent in generating the fast reaction events. To circumvent these 
problems approximate simulation approaches have been developed [30l[3U|32] that apply the quasi-stationary 
assumption (QSA) on fast ergodic subnetworks, by supposing that their stochastic dynamics relaxes to 
stationarity between subsequent reactions at the slower timescales. These fast subnetworks can change 
their structure, along with their ergodic properties, depending on the states of the slow variables which 
may determine the set of available fast reactions. In this context our results provide a way for automated 
discovery of fast ergodic subnetworks, during the simulation run, and aid the correct application of QSA. 
We illustrate this point through an example in Section |6.4[ This example also shows how restricted species 
can arise naturally when a fast subnetwork within the bigger network, is considered in isolation. 

This paper is organized as follows. In Section we introduce some preliminary concepts that will be 
used throughout the paper. The method for computing the decomposed form of state-space is explained in 
Sectionand the process for finding all the irreducible state-spaces is described in Section]^ The algorithms 
for implementing these procedures are provided in Section In Section we illustrate the applicability and 
usefulness of our methods for state-space analysis by considering several examples from Systems Biology. 
In particular we discuss how this state-space analysis can contribute towards our understanding of the 
underlying biological process. The detailed proof of our main result, Theorem |4.4| is given in Section and 
finally in Section we conclude. 


Notation 

Most of the notation used in this paper has already been introduced in Section [T] However some additional 
notation and clarifications are needed which we mention now. 

For any set A, we denote its cardinality by |A|. The vector of all zeros in any dimension is denoted by 0 . 
Similarly in any dimension d, the i-th standard basis is denoted by Ci and it is the vector whose Fth entry is 
1 while the rest are zeros. The identity matrix is denoted by I. For any mx n matrix M and any k < I < m, 
the (^ — fc -I- 1) X n matrix formed by rows (fc -\-1), {k -\- 2),... ,l of matrix M is denoted by Proj(M, k, 1). If 
Vi, ... ,Vn are the columns of M then for any A C K, the set Colspan^(M) stands for 

n 

X G : X = OiVi for some oi, 


e A 
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The dimension of any vector space V is denoted by dim(y) and this vector space is called trivial if dim(F) = 0. 
While multiplying a matrix with a vector we always regard the vector as a column vector. All inequalities 
involving vectors or matrices must be interpreted componentwise. 


2 Preliminaries 


Consider the reaction network A/" described in Section with d species and K reactions of the form (1.1|. 


Let A : 


pK 


_i_ be the propensity map given by 

A{x) = {Xi{x),...,Xk{x)). 


(2.19) 


We represent this network by the triplet Af = {V,0,A), where V and O are two dx K matrices whose entries 
at row i and column k are given by and pik respectively. We call these matrices as the reactant matrix 
and the product matrix respectively, because they tabulate the number of molecules of each species that 
are created and removed by each of the reactions. Note that the stoichiometry matrix S for this network 
is simply S = {O — V). The left nullspace C{S) of this matrix is the space of all conservation relations for 
the network and as explained in Section]^ the constraints imposed by these relations can be described by 
some conservation data (F, c), and with this data at hand the designated state-space Sq is given by (1.111. 
In the rest of this section we present some concepts and assumptions associated with the reactions networks 
we consider in this paper. 


2.1 The reachability relation and irreducible state-spaces 


For any x,y G Sq let Px{t,y) (see (1.3|) be the probability that the stochastic dynamics starts at x and 
reaches y at time t. If Px{t, j/) > 0 for some t > 0, then we say that state y is reachable from state x, and we 
denote this relation as 


M 

x — >y. 


( 2 . 20 ) 


This relation does not depend on the time-value t (see Theorem 3.2.1 in ini) and it is transitive: i.e. for 

). We say that two states x,y G £q 


any a;, y, z S if x y and y z then x z (see Chapter 6 in 
communicate if we have both x —> y and y — > x, and this relation is denoted by 


At 


y- 


j\r 


It is known that i—> is an equivalence relation on So and it partitions the set Sq into disjoint equivalence 
classes which are referred to as communication classes in the Markov chain literature (see |17]'). The com¬ 
munication classes can be further classified as closed or open. A communication class S G Sq is called closed 
if and only if for any x G £ and y G So, H x —> y then y G S. By definition a closed communication class £ 
needs to be closed under the reaction dynamics (i.e. holds with Nq replaced with £), and so it is a valid 
state-space for the underlying CTMC. Moreover as all the states in £ are reachable from each other, £ is an 
irreducible state-space for the CTMC. Therefore we often refer to such a set £ as an irreducible state-space 
in the paper. 

When the state-space So is hnite, then all the irreducible state-spaces correspond exactly to all the 
positive-recurrent classes for the underlying Markov chain (see |17jL and starting from any initial state 
in Sq, the CTMC will eventually get trapped in one of these disjoint irreducible state-spaces. The same 
holds true for infinite state-spaces if there exists a Foster-Lyapunov function on So (see Section]^ and [23]). 
For CTMCs, the property of positive-recurrence of a class £ C fo is exactly the same as having a unique 
stationary distribution supported on £. 

The problem of finding closed communication classes for stochastic reaction networks is quite challenging, 


as establishing the reachability (2.201 between any two states x,y G So is tantamount to showing that there 
exists a sequence of n reactions ki,... ,kn G {1,..., K} such that y = x -|- Cki and Xk^ {zj) > 0 for each 
j = 1,..., n, where Zj = x -I- Cfei • These conditions ensure that starting from state x, firing of reactions 
ki,...,kn in this order, takes the state to y, and this firing of reactions is a positive-probability event 
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because at all the intermediate states (zj-s), the propensity Xk^izj) for the next reaction in this sequence is 
positive. When the state-space Sq is finite, matrix methods (see [H]) can be used for computing the closed 
communication classes, without having to explicitly find any positive-probability reaction sequences between 
states. In this paper we provide an extension of this approach that can handle a large class of biological 
networks with infinite state-spaces (see Section]^. One of the main difficulty that we have to deal with is 
that some reaction channels may switch-ojf at certain states, due to their propensities being zero at those 
states, and hence the set of possible transition directions is not the same for all the states in £q. To account 
for this switching-off of reactions we need to impose some conditions on the propensity functions, as we 
discuss next. 


2.2 Conditions on the propensity functions 

To facilitate the search for irreducible state-spaces, we shall assume that the network N = (V, O, A) satisfies 
the following: 

Assumption 2.1 For each reaction k = 1,... ,K and each x = (a;i,..., Xd) S Eq, we have \k{x) > 0 if and 
only if Xi > Vik for every i = 1,... ,d. 

In other words, at state a;, reaction k has a positive probability of firing if any only if for each species S^, 
the number of available molecules {xf) exceeds the number of molecules consumed by the reaction {uik). 
The “only if’ part of this condition is almost always satisfied, because a reaction cannot fire unless for each 
species, the required number of molecules are present for consumption, but the “if’ part of this condition 
may get violated if the propensity function for a reaction is zero even though all the required molecules 
for having the reaction are present. However such situations do not typically arise for networks in Systems 
Biology as we now explain. 


Observe that Assumption 2.1 is certainly satisfied if we have mass-action kinetics [7] where each propen¬ 
sity function A/c : Nq —>■ K+ has the form 


Afc(x) =dkW 


Xi{xi - 1)... (xj - Vik + 1) 


Vik'- 


( 2 . 21 ) 


for some rate constant 9k > 0. Apart from mass-action kinetics, networks in Systems Biology generally 
have propensity functions describing either Michaelis-Menten kinetics for enzyme-substrate interactions or 
Hill kinetics for ligand-protein binding dynamics |d4j . In both these cases, the propensity functions have a 
rational form Afc(x) = pk{x)/qk{x), where the denominator qk{x) is always positive and the numerator pk{x) 


satisfies the criterion in Assumption 2.1 Consequently the network is of mass-action type (i.e. it satisfies 
Assumption |2.1[ ) even though the propensity functions are not of mass-action form (2.21). Furthermore even 
if a network does not satisfy Assumption |2.1[ it can often be modified in such a way that this assumption is 


satisfied and its dynamics remains the same (see Section 6.1). We end this section with a simple proposition. 


Proposition 2.2 Consider a network Af satisfying Assumption 2.1 with conservation data (T, c) and state- 
space So given by (1.11). Then the reachability relation is positively additive on £oi which is to 
say that for any x,y G Sq and z G Nq, if x y and {x -\- z) G Eg then we also have (y -\- z) G Eg and 
(x-Gz) ^ {y-G z). 


Proof. This is a simple consequence of Assumption |2.1| because for any reaction k and state m S £oj if 
Xk{u) > 0 then Xk{u -G z) > 0 for any z G Nq. Therefore the positive-probability reaction sequence that 
leads the state from x to y, also serves as a positive-probability reaction sequence that takes the state from 
{x -G z) to {y -G z) (recall the discussion in Section 2.1). □ 


2.3 Inverse of a reaction network 

We now define the inverse A/inv = (Mnv, C>inv, Ainv) of the reaction network N = (V, O, A), which is obtained 
by flipping the arrows in ([T 3 . In other words, the K reactions in A/inv are given by 

d d 

^ ^ Pik^i ^ ^ ^ 
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( 2 . 22 ) 










Hence Vmv = Oinv = V and we define the propensity map Ainv(a;) = (Ai 4 nv{a;) 
each Afc, 

stoichiometry matrix S\ 


, A/f,inv(a:)) by letting 

to have the mass-action form (see (2.21)) with the rate constant 0/^ = 1. Observe that the 
for A/jnv is simply the negative of the stoichiometry matrix for M, and so the space 


of conservation relations (1.7) as well as the state-space £q (1.11) remain the same for both networks. 

By construction, A/inv always satisfies Assumption 2.1 and if the original network N also satishes this 
assumption then we have the following correspondence between the reachability relations induced by the two 
networks 


N 


y if and only if 


y 


Mn. 


for any x,y G Sq. 


(2.23) 


To see this correspondence suppose that under network AC, the state can reach y from x in one reaction step. 
In such a scenario for some reaction k we have Xkix) > 0 and y = x + (^k- Since Xi > Vik and Cife = (pik — Vik), 
we have yi > pik for each i = 1,... ,d, which ensures that Xk,inv{,y) > 0 and so under the inverse network 
A/jnv, the dynamics can reach state x from state y by a single firing of reaction k. Extending this idea to 


incorporate a sequence of intermediate states and reactions (see Section 2.1) one can conclude that (2.23) 
holds. 


2.4 Reaction network under a permutation 

Our description of the stochastic model for network JV = {V,0,A), stipulates that if the state is x = 
{xi,... ,Xd) then Xi denotes the copy-number of species i (viz. SJ. However in order to simplify the 
representation for state-space we would often need to redehne the the correspondence between the species 
and the location of their copy-numbers in the state vector. This can be conveniently done by permuting the 
original network to obtain a dynamically equivalent network, as we describe below. 

We denote the set of all species labels by I? = {1,..., d}. Let ct : I? —)■ I? be any permutation (one-to-one 
and onto) map. Let P„ be the d x d permutation matrix given by 

Pa = Col (eo-i(i),..., ea-i(d)) ) (2.24) 


where ei,..., are the standard basis vectors in K' 
= PaO and the propensity map : Nq — 


and let a ^ denote the inverse of map tr. Let = PaV, 
be as in (2.19) with each Xk replaced by A^ given by 


X%{x) = Xk{Pjx) for each fc = 1,..., AT, 


(2.25) 


where Pj = Hcr-i denotes the transpose of matrix Pa- We dehne the permuted network as A/'®’ = 
(V*^,O®’, A”^). One can see that is dynamically equivalent to N in the following sense. If (A(t))t>o 
represents the stochastic reaction dynamics for network N then {X'^{t))t>o defined by 

X'^(t) = PaX{t) for all t > 0, (2.26) 


represents the dynamics for the permuted network . In other words, if X'^{t) = {xi,..., Xd) then for each 
i = 1,... ,d, Xi denotes the copy-number of species a{i) at time t. Therefore the appropriate conservation 
data and state-space for network are (To-, c) and respectively, where 


= PaT and £q = Pa£o ■= {PaX : x G £q). 


(2.27) 


One can also see that if network N satisfies Assumption |2.1[ then the same holds for the permuted network 
A/'®’. Due to the dynamical equivalence (2.26) we have the following proposition. 


Proposition 2.3 A set £^ GL £q is an irredueihle state-space for network Af‘^ if and only if the set £ = 
Pa £'^ C £q is an irreducible state-spaee for network Af. 


3 Computing the decomposed form of state-space 


The aim of this section is to provide a procedure to obtain the decomposed form (1.14) for state-space £o 
for a network N = (V, O, A) with conservation data (r,c). For this purpose, we may need to permute this 
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network according to some permutation cr and work with the equivalent network with state-space 
(see Section 2.4). 


The decomposed form can be constructed in two simple steps. Firstly by scanning the space of all semi¬ 
positive conservation relations, the bounded species are identified and their appropriate finite state-space 


£b is found (Section 3.1). Secondly the rest of the species are classified as free or restricted depending on 


any mixed-sign conservation relations between them and the afhne function (j) (see (1.15)), which gives the 
static relationship between these two sets of species, is determined (Section [T^ . The detailed algorithm for 
performing state-space decomposition is presented in Section 


3.1 Identifying the bounded species and their state-space 

Note that the space of all conservation relations can be expressed as C{S) = {To : a G K"}. Suppose that 
7 = Fa is a semi-positive conservation relation for some a G M". From (1.8) and ( 1.10| ), one can see that 
that for any i G supp( 7 ) we have 


0<X,(t) < 


(c, a) 


for all t > 0, 


(3.28) 


where Xi(t) denotes the copy-number of species i at time t. This shows that species i is hounded, but since 
it may be involved in several semi-positive conservation relations, the upper-bound {c,a)/ji for its copy- 
numbers may not be sharp. To systematically account for all these relations and obtain a sharp upper-bound 
bi we solve the following Linear Program (LP): 


bi = min (c, a) 

aGR"' ^ 

subject to Fa > 0 and {ei,Ta) = 1, 


(3.29) 


where is the f-th standard basis vector in If the feasible region of this LP is empty, then hi = oo 
and species i is not involved in any semi-positive conservation relation. Henceforth we partition the set of 
all species V = { 1 ,... ,d} into the set of bounded species = {i G V : bi < oo} and the set of unbounded 
species Vu = {i G T> : bi = oo}. Let dt = and du = IT’mI = d — db he the cardinalities of these two sets. 

Choose a permutation map ui : V ^ V satisfying 


cri(0 G 


Vb 


ioT I = 1,... ,db 

for I = {db + 1),. ■. ,d 


(3.30) 


and consider the reaction dynamics of the network under permutation a = ai (see Section 2.4). Now the 


entries in rows 1,... ,db of the state vectors will contain the copy-numbers of bounded species in Db. These 
copy-numbers, arranged as vectors in Nq'’, will always lie in the finite rectangular set 


^6 = G No" : 0 < a;/ < b^n) for each I = 1,...,4}, 


(3.31) 


but all the elements in this set may not be reachable from each other due to conservation relations among 
hounded species (see Section]^. We deal with these conservations relations now. 


Let Sa be the stoichiometry matrix for network Af°' and let (Fcr,c) be its conservation data (see (2.27)). 
Setting := Proj(S'CT, 1, db), the conservation relations among the bounded species are given by nonzero 
vectors in its left nullspace C{S^). Suppose Ub = dim (£(5'^)) > 1 and let { 71 ,... , 7 nt} denote a basis for 
C{S^). For each j = 1,..., rib, let 7^- = (7^, 0 ) G K"* and set Cj = (a^, c) where Uj G M" is the unique solution 
of the linear-system T^aj = 7 ^-. For the permuted network , the state vectors for all the bounded species 
in Db will always lie in the finite set 


£b = {a; G TZb ■ = Cj for each j = 1,..., rib}. (3.32) 

3.2 Identifying the free and the restricted species 

We now partition the set of unbounded species, into a set Df oi free species and a set D^ of restrieted 
species. Letting 5“ = Proj(S'cr, db + l,d), we define the number oi free species df and the number of restrieted 
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species dr by 


df := Rank(S'“) and dr := du — df = d — di, — df. 


(3.33) 


Note that the total number of species (d) is equal to the sum of the number of free (df), bounded (db) 
and restricted species {dr). Observe that dr is the dimension of the left nullspace C{S)f) of matrix S'“, 
which corresponds to the space of all conservation relations among the unbounded species. Any nonzero 
vector 7 S C{S)f) must necessarily be a mixed-sign conservation relation, because otherwise the species in its 


support set (1.9) would be bounded which is not the case. Recall that n = dim(£(S'cr)) is the total number of 


conservation relations while nt are the number of conservation relations among the bounded species. Hence 
we must have n > Ub + dr- As it turns out, usually for biological networks (see Section]^ this inequality is 
strict 

n = Ub + dr, 

which ensures that there are no conservation relations involving both the bounded and the free/restricted 
species. Note that this condition can also be expressed as 


Rank(S'<,) = Rank(5'^) -k Rank(S'“), 


(3.34) 


where = Proj(S'cr, 1, db) and 5'“ = Proj(5'cr, db + 1, d) as before. We will assume ( |3.34 | from now on. 

Suppose that restricted species exist (i.e. dr > 1) and the space /1(5'“) in nontrivial. Let ■ ■ ■ ,5'^ } 
be a basis for C{S)f). For any subset I = {it,... ,idj} C {1,..., d„} with |/| = df elements, let A/ be the 
du X du matrix given by 


Ai = Col (^e^^,...,ei^^,d[,...,S'ri^'^ , 

where e^-s are the standard basis vectors in . Define another set 

If = {I C {1,... ,du} ■ \I\ = df and Rank)^/) = d„}. 


(3.35) 


(3.36) 


Note that this set is nonempty and its cardinality is bounded above by ()^’(). 

Fix a, I G If and let denote its complement in the set {1,..., d„}. We define the set T>f of free species 
and the set Vr of restricted species as Vf = {a{db + i) : i G 1} and T>r = {cr{db + i) : i G /“}. These two sets 
partition the set T>u. We now choose another permutation map CT 2 :1? —2? satisfying 


a' 2 {l) = ai{l) for / = 1,..., db and cr 2 {l) G 


'Df 

Vr 


for 1 = (dfc + 1 ),..., {db 
for 1 = (db + d/ + 1 ),... 


-df) 

,d. 


(3.37) 


Let be the network under permutation a = 02 (see Section \2.A \ , and let Sa and (F^, c) be its 
stoichiometry matrix and conservation data respectively. For each i = 1,... ,dr, the vector 5i = (0, 5[) G 
belongs to C{Srr) and hence the vector 5i = PuPjSi belongs to C{Srr). Since the permutations cti and (T 2 
are identical on {!,... ,db}, each Si must have the form Si = (0, d-^^) for some vectors and 

G Define a df x dr matrix Ai = Col(dj^\ ..., d^^^) and a dr x dr matrix A 2 = Col(d|;^\ ..., d^^^). 
Observe that if A/ is the matrix given by (3.35), then there exists a d„ x d„ permutation matrix Q such 
matrix QAj has the form 


QAj = 


I Ai 

0 A 2 


(3.38) 


where I is the df x df identity matrix and 0 is the dr x df matrix of all zeroes. Matrix Aj is invertible 
because I G If, and hence matrix A 2 is also invertible. 

From now on let = Proj(S'cr, 1, db), S)f = Proj(S'cr,db + l,d), Sf = Proj(S'“, 1, d/) and 5^ = 
Proj(S'“,d/ + l,df + dr). For each i = l,...,dr, the vector d,; 

0 denotes the vector of zeros in 


= ( 0 ,d|^\d|^^) belongs to C{Srr), where 


Hence we must have = —A^Sf = 0 which allows us to write 


s: 


= -{A^)-^Afsf = -{AfYAlsf. 


(3.39) 
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This also shows that Rank(S'^) = dj because d/ = Rank(S'“) = Rank(5'“) (see (3.33)) and has the form 




■ si ■ 



. Sf . 


[ -(A2-i)^Af J 


Si. 


The state-space for network A/”is £q = P^So = {a; G Nq : T^x = c} (see (2.27)). Any element in this set 
can be expressed as x = (x{,,x/,Xr) where Xf, G x/ S Nq'^ and Xr G Ng''. Since the columns of To- span 
C{Scr), for each i = 1,..., dr, there exists a unique n x dr matrix M such that 

r,M = Coi(?i,...,?dj. 


Since (3.34) holds, the last [df + dr) rows of the condition T^x = c yield Af’x/ -|- A^x^ = c for c = M^c. 
This shows that Xr = 4>{xf), where the affine map (p : —>■ is dehned by 

(fix) = {Al)-'^c-{A'l)-^Afx. (3.40) 

This analysis proves the following proposition. 


Proposition 3.1 Assuming condition (3.34) holds the state-space £f for network can be expressed as 

£o = £b X (3.41) 


where d* is the graph (1.15) of function 4> defined by (3.40). 

Note that this state-space decomposition result only depends on the reaction stoichiometries but not on their 
propensities. Indeed the propensity functions can be completely general as long as they satisfy the basic 
assumption (1.2) which ensures that the dynamics is contained in the positive orthant. We end this section 
with an important remark. 

Remark 3.2 Note that the classification of unbounded species into free and restricted species depends on 
the set I which can chosen to be any element in the set If given by (3.36). This flexibility will be useful in 
the next section. 


4 Identifying the irreducible state-spaces 


In this section we shall assume that network N = (V,0, A) satisfies Assumption 2.1 
the permuted network = (V' 


As a consequence, 


also satisfies this assumption, which is a property that will 
play a crucial role in our search for irreducible state-spaces within the infinite state-space £f,. Recall that 

these state-spaces are the closed communication classes for relation on £q (see Section |2.l[ ) . From the 
discussion in Section |3.2| it is immediate that restricted species have no independent dynamics of their own 
and they essentially mimic the free species according to the affine map (p. This suggests that for finding 
irreducible state-spaces we can simply remove the restricted species and concentrate on the dynamics of the 
bounded and the free species. We now describe this step formally. 


4.1 Network reduction by elimination of restricted species 

We construct a “reduced” network with {db-\-df) species in the set Vb^Pf where Vb = {o’(l), • ■ ■ ,<^{db)} 
and Vf = {a{db -I- 1),..., cr{db -I- df)}. For each reaction k = 1,... ,K, let its propensity function Xf : 


N?!'’ X Nn^ 


where Xf, € Ng’’, x/ S Ng^ and the function Xf is defined by (2.25). Let the propensity map A'^ : Nj 


be given by 


Nf 


Xf{xb,Xf) = Xf{xb,xf,(p{xf)) 


(4.42) 


jdi+df 


be as in (2.19) with each Xk replaced by Xf. Setting = Proj(V'^, \,db-\-df) and , l,db-\-df), 

we define the reduced network as ,0"^, PN). 
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Unfortunately the reduced network may not satisfy Assumption 
work does 


2.1 


even though the original net- 


see Example 6.4). We need to avoid this problem because our approach requires this property. 
Fortunately this can be done by exploiting the flexibility in the choice of set / (see Remark 3.2), which 
classifies each unbounded species as free or restricted. Note that different choices of I will yield different 
reduced networks but they correspond to the same dynamics for the original network. Hence the irreducible 
state-spaces for the original network can be found with any I chosen as per our convenience. We sequentially 
examine each element in the finite set If (see ( |3.36| )) until we find a / for which the affine function (j) satisfies 
the following assumption (see Algorithm [^. 

Assumption 4.1 Consider an affine map f : —>■ 


f{x) = Fo+ Fix, 

where Fq is a vector in and Fi is a d^. x df matrix. We say that this map is compatible with network 
= (V*^, A*^) if Fq and Fi have all the entries in No and the matrix inequality 

Fol^-bFiV; > v; (4.43) 

holds, where V? = Proj(V‘^, db -I- 1, db -I- df), Vf = Proj(V‘^, db + df + l,d) and 1 is the vector of all ones in 


If the affine map </> satisfies this compatibility condition, then the network reduction will automatically 
satisfy Assumption 2.1 To see this note that (4.43) implies that for any reaction k, if Xf > Vkj then 


fixf) > ^rj, where Vkj denotes the k-th. column of matrix VJ {Vf). Therefore the reduced network 

satisfies Assumption 2.1 because 


Xl{xb,xf) > 0 


Xl{xb,xf,(j){xf)) > 0 

{Xb,Xf,(j){Xf)) > {l'k,b,l'k,f,Vk^r) 

{xb,Xf) > {vk,b,i'kj) and (j){xf)>yrj 

{Xb,Xf) > {vkp.Vkj), 


where Vk,b is the fc-th column of matrix Proj(V'^, 1, db) and denotes “if and only if’. 

It is easy to check that the reduced network A/"'’’ preserves all the semi-positive conservation relations 
among the bounded species. Therefore if 4> satisfies Assumption |4.1| the state-space for this reduced network 
is 


= £^xN 


df 

0 ■ 


In what follows, we shall identify all the irreducible state-spaces for the communication relation —> induced 
by the network Af'^ on £q . The next proposition then allows us to recover all the irreducible state-spaces for 
the original network Af°'. 


Proposition 4.2 Assume that the affine map f, given by (3.40), satisfies Assumption f.l Then for any 

F C and G C Nq'^ , the set F x G is an irreducible state-space for relation > if and only if the set 

F X is an irreducible state-space for relation <—>■. Here $<3 is the graph of function f (see (1.15)^, 
restricted to the domain G. 


Proof. The proof follows simply from the construction of the reduced network Af'^ and the fact that the 
dynamics of the restricted species is “tied” to the dynamics of the free species according to map f. □ 


4.2 Networks with only bounded species 

We first consider the case when there are no free species (i.e. df = 0), and hence the state-space £q = £^ is 
finite. In such a situation all the irreducible state-spaces can be found using simple matrix manipulations. 
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We briefly describe this approach in the context of reaction networks and introduce the relevant concepts 
that will be useful later in the paper. 

Let = {yi ,..., yN^} be the finite state-space with W = \£^\ states. For each reaction k, let G Ng*’ 
and Cj, € be the vectors denoting the fc-th colu mn of the reactant and the stoichiometry matrices of the 
network. As network satisfies Assumption 


2.1 


only the following reactions have a positive propensity of 


firing 


Define a x Nt matrix Z by 


Zij — 


^r{y) = {fc = 1 ,..., a: : y > vl}. 


1 if yj = yi + Ck foi' some k G ICriVi) 
0 otherwise. 


(4.44) 


(4.45) 


We can view Z as the zero-pattern matri >gof a finite Markov chain m and use it to study reachability and 
communication relations corresponding to network Af'^ (see Section 2.11. Define the x Nh reachability 
matrix by 


il={I + Z) 


Nt-l 


(4.46) 


where I is the Ni, x identity matrix. For any two states yi,yj G £^, state yi is reachable from state pj 
if and only if the *j-th entry of this reachability matrix is positive (i.e. fly > 0). Therefore based on this 
matrix we can define the communication relation 0 on £^ by 

0 = X £^ : > 0 and Vtji > 0}. (4.47) 

As we mentioned in Section EH 0 is an equivalence relation on £^ which partitions this set into distinct 
equivalence or communication classes. Let 0 = [0ij] be the IVf, x N}, matrix representing this relation, i.e. 
Qij = 1 if [yi,yj) G 0 and 0^ = 0 otherwise. One can verify that states pi and yj will belong to the same 
communication class if and only if rows i and j are identical in matrix 0. Let Uc be the number of unique 
rows of matrix 0 and let U be the Uc x Ni, matrix formed by these rows. Each row i oi U corresponds to a 
distinct communication class made up of those states pj for which Uij = 1. To study the interaction among 
communication classes we define another Uc x Uc matrix R by 

R = UZU"^ - I. (4.48) 


This matrix captures the reachability relations among the communication classes. In other words, if Rij > 
0 , then there exists a reaction that takes a state in the Tth communication class to a state in the j-th 
communication class. If the Tth row of matrix R only consists of zeros then the i-th communication class 
in closed and otherwise it is open. Starting from any initial state in £^ the dynamics of the bounded species 
will eventually get trapped in one of the closed communication classes and these are the only irreducible 
state-spaces for network without any free species. 


4.3 Networks with only free species 

In this section we assume that there are no bounded species (i.e. db = 0) and so all the irreducible state- 
spaces must lie inside the nonnegative integer orthant £q = Ng^. We will find the irreducible state-spaces 
by adopting a simple scheme that attempts to arrange the free species into birth and death cascades. We 
begin by formalizing the notion of birth-cascades for the network Af'^ = (V*^, O'^, A°’). 

For each reaction k, let and p% be the vectors in Ng , denoting the k-th. column of matrices V’^ and 
respectively. As df, = 0 the set of all free species is Vf = {cr(l),..., a{df)}. From now on let J" = {1,..., d/} 
denote the set of addresses of all the free species under the map cr(-). For any A C R, let 17^(4) be the 
subset of free species given by 


VJ{A) := {a{i) GVfi&A) 


(4.49) 


^The zero-pattern matrix corresponding to a finite Markov chain is obtained by setting all the positive entries in its probability 
transition matrix to 1 and all the rest to 0 


14 





and let B(A) C be defined by 

]B(A) = {i ^ T ■. i ^ A, supp(^'^) C A and i S supp(; 0 ^) for some reaction fc = 1,..., K} . 

The set 'Dj{M{A)) represents the free species that do not belong to the set 'D'^{A) and are produced by a 
reaction that only consumes the free species in 'DJ{A). 

Using this mapping B, we define a sequence of subsets of indexed by nonnegative integer levels 
Z = 0 , 1 ,..., as follows: let Gq = 0 and for each Z > 1 let 


Gz =Gz_iUB(G,_i). 


This sequence of sets {G; : Z = 0,1,2,... } represent the birth-cascades for network . At any level I, the 
set 'D'j{Gi) consists of all those free species that either belong to the previous cascade I?^(G/_i) or it is 
produced by a reaction that only consumes the free species in this previous cascade I?^(G/_i). In particular 
for level I = 1 the set I?^(Gi) consists of all those free species that can be produced from nothing, due to 
reactions of the form 0 —> S. From this cascade construction one can expect that all the free species in 
each of these birth-cascades, can have arbitrarily high copy-numbers due to the reaction dynamics. This is 
precisely what we show later in the paper (see Lemma 7.1). 

As the number of free species is finite, the following number is well-defined 


lb = max{Z e N : Gz ^ Gz_i} = min{Z G N : B(Gz-i) = 0}. 

The monotonically increasing sequence of sets Go,Gi,... stops growing beyond level lb and so we have 
Gz = for all I > lb- We define the Birth-Cascade Path (BCP) as the directed graph 

0 ^ Gi ^ G 2 ^ ^ Gz, (4.50) 

and refer to Gi^ as its terminal node. The set of all the free species that can be arranged into birth-cascades 

B = VJ{Gi,). 

We define two reaction-sets as 


Ar-= {fc = 1,..., AT : supp(z^fe) C GzJ and Ap = (fc = 1,..., AT : supp(pfc) C Gz J. (4.51) 

The set K-r (Ap) consists of all those reactions that do not consume (produce) the free species that lie outside 
the set B. Since Gz,, is the terminal node of the BCP we must have that 

ICr C /Cp, (4.52) 


which is to say that any reaction that only consumes the free species in B must not produce any free species 
outside this set. ^ 

Analogous to birth-cascades we now construct the death-cascades for network Af'^, but we restrict our 
attention to only the species in B and only the reactions in ICr- Let Af '^jB ) be the network restricted to only 
these species and reactions and let be its inverse (see Section 2.3). We construct the birth-cascades 

for this inverse network as before, and define the Z-th death-cascade Gz for network as the Z-th 

birth-cascade for network Correspondingly the Death-cascade Path (DCP) for network Af®’ is the 

BCP for network A/’;^^(;B) and we can represent it as 


0^Gi=»G2^---^Gz,, 


(4.53) 


where Gz^ is the terminal node of DCP and the set 

X = V^Gu) 

consists of all the free species that can be arranged into death-cascades. Note that since the set of species is 


restricted to B we must have X C B. The dynamical relation (2.23) satisfied by a network and its inverse 
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provides us with the following interpretation of death-cascades: the free species in T>f{Gi) for level I = 1 
degrade spontaneously due to reactions in of the form S —>■ 0, while for higher levels I > 1 each free 
species in DJlGi) is converted by a reaction in ICr to some free species S^- at a lower level, i.e. species Sj 
belongs to the set For most biological networks, spontaneous degradation in very common and 

so typically the first death-cascade to be heavily populated (see Section [^. The species in B that 

do not belong to I?J(Gi) usually belong to one of the higher death-cascades because generally they undergo 
a sequence of conversions to eventually produce a spontaneously degradable species (see Section]^. These 
remarks suggest that for most biological networks we can expect to have 

B = X, (4.54) 


which is to say that all the free species in B can be arranged into death-cascades by the procedure we just 
described. 

For our main result we need to ensure that each and every molecule of the free species in B can be 
flushed-out from the system. Due to this reason we need to impose certain stoichiometric restrictions on the 
network as we describe now. Let be the df x K stoichiometry matrix for network and let [/Cp] be 


the matrix formed by restricting S^- to only those columns that correspond to reactions in ICp (see (4.51)). 
For any i ^ T the free species a(i) € Vf is said to be singularly-degradable if 


-Cj C Colspaupj^ [^p] j , 


(4.55) 


where denotes the i-th standard basis vector in . Note that if a free species can degrade spontaneously 
because of a reaction like S —> 0 then it is certainly singularly-degradable. Hence condition (4.55) primarily 


pertains to those free species that require a sequence of conversion reactions to produce a spontaneously 
degradable species. We can easily identify all the singularly-degradable free species by computing a modified 
Hermite normal fornj^of the transpose of the matrix — S'o- [^p]- For obtaining this normal form, the only 
admissible operation is the addition of one row to another, possibly after multiplication by a positive integer. 
Once this Hermite normal form has been computed, the free species a{i) is singularly-degradable if and only 
if there exists a row with the leading entry of 1 in column i of the Hermite normal form. 


We now state the main result of this section Proposition 4.3 which determines an irreducible state-space 
for network and also provides a way to easily check if it is the only irreducible state-space for the network 

^ df 

in the infinite state-space = Nq'^ . To state our main result, we shall consider the network dynamics under 
the permutation a — : V ^ V defined by 





for i = 1 ,..., d/t, 

for i = {dfb .,df 

ioT i = {df -\-1,..., {df -\ 


dr') , 


(4.56) 


where dfb = \B\ be the total number of free species that can be arranged into birth-cascades, the set B is 
B = {ji,... ,jdft,} and the set of remaining free species is Vf OB'^ = {mi ,..., mdj-dfb}- Observe that under 
this permutation the copy-numbers of all the free species in B occupy the first dfb components of the state 
vector. 


Proposition 4.3 Assume that network satisfies Assumption 


db = 0). Suppose that permutation a is defined by (4.56) and (4.54) holds. Then we have the following: 


2.1 and it has no bounded species (i.e. 


(A) If all the free species in B are singularly-degradable then Nq^’’ x {0} is an irreducible state-space for 
network Af'^, where 0 denotes the {df — df^b)-dimensional vector of zeros. 


(B) Additionally if all the free species in TAf H B'^ are also singularly-degradable then Nq^’’ x {0} is the 
only irreducible state-space for network Af'^. 

^The Hermite normal form is an analogue of the row-reduced echelon form for integer matrices. For more details see m- 
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Proof. This proposition is a special case of Theorem |4.4| that is proved later in the paper. □ 

We end this section with a simple example that emphasizes the importance of imposing the condition of 
being singularly-degradable on the free species. Consider a network in which a single species S participates 
in the following two reactions 


0 ^ 2S ^ 0. 

We assume that the propensity functions follow mass-action kinetics. In this network the molecules of S are 
produced and degraded in pairs. One can easily check that for this network B = X = {S}, but No is not 
an irreducible state-space because species S is not singularly-degradable. One can check that this network 
has two irreducible state-spaces: the set of all odd integers O = {1, 3, 5,... } and the set of all even integers 
E = {0,2,4,...}. 


4.4 Networks with both free and bounded species 

In this section we consider the general case where network Af°' = (V^, 0“^, A°’) has both free and bounded 
species and so all the irreducible state-spaces will lie in the infinite set x Nq-^ . We now develop a 


procedure to identify all these state-spaces by intertwining the matrix-based approach in Section 4.2 with 
the birth-death cascade construction described in Section 4.3 However instead of linear path-like cascades 


we need to construct tree-like structures as we explain below. 

We first fix some notation. For each reaction k = I,..., K let Vi, , pf and pf. denote the k-th. column 
of matrices Proj(V'^, I, db), Proj(V‘^, db-fl, df,-|-d/), Proj(C>'^, 1, df,) and Proj(0‘^, df,-f 1, df,-l-d/) respectively. 
Recall that the set of all free species is Vf = {(j{db -I- 1),..., cr(db -f d/)} and let = {1,..., d/} denote the 
set of addresses of all the free species under the map a{db-l- ■)■ For any A C X, let 'DJ{A) cVf he given by 
(4.49) with cr(-) replaced by cr(dt, -f •)• Corresponding to such a set A we define a set of reactions as 


ICr{A) = {k = l,...,K : supp(z/^) C A}, 


which includes only those reactions that do not consume any free species that lie outside the set T>J{A). 
For any y G £b define 


K^r{y, 4 ) = ICriy) n ICriA), 


where ICr{y) is given by (4.44). Let Z{A) be the zero-pattern matrix defined by (4.45) with JCriy) replaced 


by lCr{y,A). Using the procedure described in Section 4.2 we can compute the associated communication 


relation 0(4) (see (4.47)) along with the corresponding set of equivalence or communication classes. We can 
also compute the reachability relations R{A) (see (4.48)) among these classes and determine which classes 
are closed and open. From now on let C{A) denote the set of closed communication classes under relation 
0(4). 

Suppose 4 i ,42 are two subsets of F such that 4i C 42 . Then certainly K,r{Ai) C K,r{A 2 ) and hence 
Zij{Ai) < Zij{A 2 ) for each i,j. Consequently 0 ( 42 ) has fewer equivalence classes than 0(4i), and each 
class of 0(4i) is contained in a unique class of 0 ( 42 ). Let Ci and C 2 be two closed communication classes 
in the sets C(4i) and C( 42 ) respectively. We say that Ci reaches C 2 if either Ci C C 2 , or Ci is a subset of 
an open class O for 0 ( 42 ) and the closed class C 2 is reachable from O under relation i?( 42 ). We define a 
map from C(4i) to a subset of C( 42 ) by 


^(Ai,yi 2 )(C'i) = {C 2 e C( 42 ) : Cl reaches C 2 }. 

Observe that this map is transitive in the sense that for any 4i C 42 C 43 C if (72 S 4 


C 3 € 'I'(A 2 ,A 3 )(C' 2 ) then C 3 G ((7i). 


(Ai,A 2 )(C'i) and 

This transitivity holds because 1Ct{Ai) C /Cr(42) C JCr{Af). 

We now develop the notion of a Birth-Cascade Tree (BCT) by generalizing the ideas in Section 4.3 This 
tree is developed in levels or generations indexed by nonnegative integers I = 0 , 1 ; 2 ,..., and it a directed 
graph with nodes in the set 


T = {((7,4) : 4 c and (7 e C(4)} 


(4.57) 
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and edges in the set T x T. For generation I = 0 the BCT is initialized by simply adding the nodes (C, 0) 
for each C S C(0). For any generation / > 1, the BCT is extended as follows: for each node (C, A) that was 
added to the BCT in the previous generation (Z — 1) we compute the set 

B(C', A) = {i & F : i ^ A and there exists a fc S ICr{C, A) such that i G supp(p^)}, (4.58) 


where 


JCr{C,A)= y JCr{y,A), 

yec 

is the set of those reactions that only consume free species in the set Vf (A) as reactants and have a positive 
probability of firing when the dynamics of bounded species is in the closed communication class C. If 
B(C', ^ 0 then we set A' = A\J B(C, A) and for each C S we add the node {C ,A') to the 

BCT along with the directed edge [C,A) ^ (C',A'). 


The interpretation of the set B(C', is similar to the interpretation of B(>1) in Section 4.3 with the only 
difference being that now the states of bounded species can move around in the closed communication class 
C. As the number of free species is finite, the BCT will stop growing beyond some generation Ih and this 
point the construction of BCT is complete and any node (C, A) from which there are no outgoing edges is 
called a leaf of the BCT. For such a leaf node we have 


B(C,A) = 0, (4.59) 

and there exists a set of {I + 1) BCT nodes, {{Ci, Gi) : i = 0,... ,1} such that Go = Ci = C, Gi = A and 
the BCT has the following directed path 


(Go,Go)^(Gi,Gi) 


(CuGi). 


(4.60) 


The transitivity of the map 'b mentioned above implies that for any i G {0,..., (Z — 1)} we have Gi G 
^(Gi,G()(^i)- Moreover since the set of reactions JCr{Gi),JCr{G 2 ), ... is monotonically increasing, if Gq C Gi 
then Gi C Ci for each i = 1,..., (Z — 1). 

Let L be the set of all leaf nodes in the BCT. A leaf node (G, A) S L is called minimal if there is no 
other leaf node that is strictly contained in (G, A) i.e. if (G',A') S L, G' C G and A' C A then C = G' 
and A = A' . Henceforth we denote the set of all minimal leaf nodes by Lmi„. Our main result of the paper. 
Theorem |4.4[ will show that under certain conditions these minimal leaf nodes exactly characterize all the 
irreducible state-spaces for the network in the infinite state-space x . However before we 

state this result we need an appropriate notion of death-cascades and we need to impose some stoichiometric 
restrictions on the network as in Section [4.31 

Pick a minimal leaf node (G, A) G Lmin and let N'^(G^A) be the network formed by removing all the 
free species outside TAJ {A) and discarding all the reactions outside /Cr(G, A). Moreover we also replace the 
finite state-space of the bounded species with its subset G. This replacement is feasible because (4.59) 
holds ensuring that reactions in lCr{C,A) cannot produce any free species outside V'j{A), and G is a closed 
communication class for the state s of bounded species under these reactions. Let J\f(^^{G,A) be the inverse 
of network M'^{G,A) (see Section 2.3). We compute the BCT for network A/)^^(G, A) and refer to it as the 


Death-cascade Tree (DCT) for network A). Let Ld(G, A) denote the set of all leaf nodes of this DCT. 

Note that for any leaf node (G', A') G Ld(G, A) we certainly have G' C G and A' C A. We say that (G, A) is 
death-exhaustive if there exists a leaf node (G', A') G Ld(G,A) such that A' = A. If such a leaf node exists 


then it is the only leaf node in Ld(G, A) since G' = G. This is shown in the proof of Lemma 7.2 

Consider a minimal leaf node (G, A) G Lmin and let lCp(A) be the set of all those reactions that do not 
involve any bounded species and do not produce any free species outside the set I?j(A), i.e. 


/Cp(A) = {k = l,...,K : supp(t/^) = supp(p^) = 0 and supp(p^) C A}. 


(4.61) 


Let Sa = Proj(5'o-, db -\- l,db -\- df) be the df x K matrix formed by the last df rows of the {dt -b d/) x K 
stoichiometry matrix for network and let S^[ICp{A)] be the matrix formed by restricting So- to only 
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those columns that correspond to reactions in ICp{A). Recall that the set of all free species is Vf = {a{db+i) : 
i G J-}. For any i G jF we say that the free species a{db + 1 ) G T>f is singularly-degradable w.r.t. A if (4.55) 
is satisfied with matrix Sa[K,p{A)]. To obtain the irreducible state-space corresponding to the leaf node 
{C,A) G Lmiii) we shall permute the network according to the permutation 173 (A) : I? —^ I? which ensures 
that the copy-numbers of all the free species in 'D'j{A) occupy the components {db -I- 1),..., (df, -I- |v4|) of the 
state vector. Such a permutation can be defined by setting is to be the same as (T 2 on the set {!,... ,d{,} 


and letting its image on the set {{db -I- 1),..., {db + d/)} be determined by (4.56) with B = V'j{A). We now 
state the main result of our paper. 


2.1 


Theorem 4.4 Assume that network satisfies Assumption 
L and Li„in be the sets of all leaf nodes and minimal leaf nodes for the BCT of the network, 
the following: 


and its state-space is £q = x Nq^ . Let 

Then we have 


(A) Pick any minimal leaf node {C^A) € Lmin. Suppose that this leaf node is death-exhaustive and all 
the free species in 'D'j{A) are singularly-degradable w.r.t. A. Then the set C x x {0} is an 

irreducible state-space for network permuted according to permutation a = (j^{A), where 0 denotes 
the {df — \A\)-dimensional vector of zeros. Additionally if all the free species in the set Vf H 

are also singularly-degradable w.r.t. A then C x x { 0 } is the only irreducible state-space for 
network Af'^ that can have a nonempty intersection with the set C x Nq^ . 

(B) Suppose that all leaf nodes are minimal (i.e. L = each leaf node {C,A) € L is death exhaustive 

and all the free species in Vf are singularly-degradable w.r.t. A. Then up to the relabeling of free 
species inVf, all the irreducible state-spaces for network are given by 


C X n{,^' X {0} for all (C, A) G L. 


Proof. This theorem is proved in Section and it generalizes Proposition |4.3| by taking the dynamics of 
bounded species into account. □ 

We end this section with a couple of useful remarks. 

Remark 4.5 Commonly in Systems Biology networks (see Section there is only leaf node {C,A) and 
there is only one class Cq in C{tb) such that Cq C C. Hence all leaf nodes are minimal (because L = Lmin = 
{{C,A)}) and one can see that the BCT reduces to a single path 


(Co,0) =» (Ci,Gi) ^ ^ {Ci,,Gi,) = {C,A). 


In such a scenario we can talk about the birth-cascades for the network in the same way as in Section \7.2\ 
In particular T>'j{Gi) is the set of free species that belong to the l-th birth cascade. If the leaf node {C,A) 
is death-exhaustive as well, then the Death-Cascade Tree (DCT) for network Af'^{C, A) is also a single path 
and so we can also talk about the death-cascades for the network in the same way as in Section\7.2{ 


Remark 4.6 Suppose from part (A) of Theorem 4-4 we obtain an irreducible state-space for network Af'^. 
Then the co rresp on ding irreducible state-space for the original network JC can^be easily identified using 
Propositions 4-2 and 2.3 Similarly if all the irreducible state-spaces for network can be found using part 
(B) of Theorem \M then we can identify all the irreducible state-spaces for the original network J\f using 
Propositions\4.^ and\2.3\ 


5 Algorithms 


The aim of this section is to provide detailed algorithmic descriptions of various procedures that can be 
used to apply the results in this paper. We start with network J\f = (V, 0,A) with d species in the set 
D = {1,... ,d} and K reactions of the form (see Section]^. We assume that this network satisfies Assumption 
2.1 and its conservation data is (F, c). Our first goal is to find a decomposed state-space £q of the form 
(1.14) under some suitably constructed permutation a : D ^ D. This is accomplished in the method 
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FindDecomposedStateSpace(-) (see Algorithm [T). This method starts by identifying the bounded species 
and finding their optimal state-space (see Section ]3T] ). It then computes the numbers of free (df) and 
restricted {dr) species, and if dr > 0, then it tries to classify the unbounded species into free and restricted 



The outputs returned by FindDecomposedStateSpace(-) are the permutation map cr and the decomposed 
state-space £q. 


function FindDecomposedStateSpace(A/’, T, c) 

For each species i solve the LP (3.29) to compute bi. 

Set Vb = {i G V : bi < 00} and = {i G V : bi = 00} to be the sets of bounded and unbounded 
species respectively. Also set dt = \T>b\ and du = \'Du\- 

Select the permutation map cti : V ^ V according to (|3.30|) . 


Set cr = tJi and construct the permuted network (see Section 2.4) along with its conservation 
data (Ter, c). 


Compute the finite set according to (3.32). This is the state-space for bounded species in Vb. 
Evaluate df and dr according to (3.33). 
if dr = 0 then 

Output: The decomposed state-space is £q = £b x under permutation a. 
else 


Algorithm 1 Finds a decomposed state-space £q of the form (1.14) for a network Af = {V,0,A) with 
conservation data (T, c) 

Require: Network Af satisfies Assumption |2.1| 

1 
2 
3 

4: 

5: 

6 

7 

8 
9 

10 
11 
12 

13 

14 

15 

16 

17 

18 

19 

20 
21 
22 

23 

24 

25 

26 


Verify condition (3.34) and if this condition is not true then return QUIT 


Compute the set 1/ given by (3.36). 
for all I Gif do 

Set TAf = {a{db -I- i) : i G 1} and Vr = {(j{db + i) : i G /'^} to be the sets of free and restricted 
species respectively. 

Select the permutation map <72 '■ TA —>■ TA according to (3.37). 


Set cr = (72 and construct the permuted network along with its conservation data (To-, c) 
Define the affine map (f : - 

(j) satisfies Assumption |4.1| 


according to (3.40). 


if (j) satisfies Assumption |4.1| then 

Output: Map (j> is compatible with network 
Exit the for-loop and go to step 
end if 
end for 


Let be the graph of (j) given by (1.15) 


Output: The decomposed state-space is fg = x 4) under permutation cr 

end if 

end function 


If the network has any restricted species (i.e. dr > 0), 


then we construct the reduced network = 
by systematically removing the re- 


(V, 0,A) along with the associated permutation cr = cr 2 (see (3.37)) 

stricted species as described in Section |4.1| Assuming thnt a compatible affine map (f was _ discovered by 

which is nec- 


2.1 


FindDecomposedStateSpace(-), the reduced network A/"'^ will also satisfy Assumption 
essary for the process of identifying all the irreducible state-spaces. This process is accomplished by our 
next method FindirreducibleStateSpaces(-) (see Algorithm]^, which works by first constructing the 
Birth-Cascade Tree (BCT) for network A/"'^ and then examining all the minimal leaf nodes of this BCT (see 
Section 4.4). The construction of BCT is performed by method ConstructBCT(-) (see Algorithm]^ which 
returns the set of leaves L of the BCT. 
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Algorithm 2 Identifies the irreducible state-spaces for network = (V, O, A) in the state-space £§ x 

Require: Network only has bounded and free species in the sets Pf, = {cr(f) : i = 1, ■ ■ ■, db} and 
Vf = {a{db + i) : i = 1, ..., df} respectively, 
function FindirreducibleStateSpaces(A/’'^, x Nq^) 

Let L = ConstructBCT(A/’'^,£^ x Nq-^) 

Identify the set of minimal leaf nodes Lmin C L 
Initialize L' = 0. 
for all (C, A) G Lmin do 

Let (C, A) be the network formed by removing all the free species outside 'D’j (A) and discarding 
all the reactions outside lCriC,A) (see Section 4.4). 

Let A/^^^(C, A) be the inverse of network A) (see Section 2.3). 

Set hd{C,A) = CONSTRUCTBCT(Af^^(C',A),C' X 


7 

8 
9 

10 

11 : 


Let JCp{A) be the set of reactions given by (4.61). 

Construct the matrix S„[K,p{Ay\ by restricting the stoichiometry matrix for network A/”®^ to only 
its last df rows and only the columns that correspond to reactions in /Cp(A). 

Compute the set of free species I?sd(A) C Vf that are singularly-degradable w.r.t. A i.e. 


Vsd{A) = {a{db -\- i) : i = 1,... ,df and (4.55|) is satisfied with matrix Scr[ICp{A)]}. 


if hd{C, A) = {{C, A)} AND Vsd{A) C := {a{db +i):iGA} then 


Set a — (73 (A) (see Section 4.4) and construct the permuted network JV^ (see Section 2.4). 


Output: The set CxM}/^' x {0} is an irreducible state-space for network under permutation 
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20 
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22 : 

23: 


if Vsd{A) = Vf then 

Update L'^L'U{(C', A)}. 

end if 
end if 
end for 

if L = L„iin = L' then 

Output: The irreducible state-spaces found in step are the only irreducible state-spaces for 
network in the state-space £^ x Ng^ up to the relabeling of free species in I?/, 
end if 

end function 


For each minimal leaf node (C, A) S Lmin the method FindirreducibleClasses(-) checks if the set 
C X X {0} is an irreducible state-space for network Af'^ under the permutation a = cr 3 (A) (see Section 


4.4) along with whether all th e free species are singularly-degradable w.r.t. A. If both of these are true, 
then part (A) of Theorem 4.4 tells us that C x n[)^' x {0} is the only irreducible state-space which can have 


common elements with the set C x Nq^. If this is the case for all minimal leaf nodes and L = Lmin, then 
using Findirre duc ibleCla sses (-) we recover all the irreducible state-spaces for our network due to part 
(B) of Theorem 


4.4 


Remark 


4.6 


tells us how these irreducible state-spaces for network Af'^ can be mapped 


to the corresponding irreducible state-space for the original network J\f. 
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Algorithm 3 Constructs the Birth-Cascade Tree (BCT) for network = (V,C>,A) with state-space 
dj 


£§ X No 


This BCT is a directed graph with nodes in the set T given by (4.57). 


Require: Network only has bounded and free species in the sets Vh = {u{i) 
Vf — {cr(db + i) i = 1,..., df} respectively. 


= 1,..., db} and 


1 

2 

3 

4 

5 

6 

7 

8 
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10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 
21 


function ConstructBCT(A/''^, x Ng^) 

Initialize / = 0, L = 0 and Go = {(G, 0) : for each G S C(0)}. 

Add each node in Gq to the BCT. 
repeat 

Set Gi+i = 0 

for all (G, A) e Gi do 

Compute the set B = B(G, A) where the operator B is defined by (4.58). 
if i? ^ 0 then 
Set A' = AUB. 
for all G' G do 

Update G;+i ■(— G;+i U {(G^, A^)}. 

Add the node (G', A') to the BCT along with the directed edge (G, A) 

end for 
else 

Update L^LU{(G, A)} 

end if 
end for 

Update ^ ^ Z -I-1 

until Gi = 0 

Output: The set of leaf nodes of the BCT is L. 

end function 


(G',A'). 


6 Examples 


In this section we illustrate our results using several networks from Systems Biology. We start by considering 
a family of simple Gene-Expression networks which illustrate various theoretical ideas developed in this paper 


(see Section 6.1). Next we consider a couple of Circadian Clock models (Section 6.2) and a Bacterial Heat- 
Shock response model (Section |6.3[ ). The networks underlying these models have many species and reactions, 
and we discuss how the state-space analysis presented in this paper can help in understanding network 
design as well as the long-term behavior of the associated stochastic models. In Section |6.4| we provide a 
simple example of a Toxin-Antitoxin network to demonstrate how our results can aid the automatic, real¬ 
time application of the quasi-stationary approximation to speed up stochastic simulations of the multiscale 
network. Finally in Section [6.5[ we present a class of networks, where our analysis along with certain existing 
results, provide the exact stationary distribution for the stochastic model. 

In all the examples, the reactions propensity functions (Afc-s) are assumed to have the mass-action form 
(2.21) unless otherwise stated. For correct interpretation of our results we provide a “Species Chart” that 
encodes the names of network species into the notation used in our paper i.e. Si, S 2 ,.... Throughout this 
section the copy-number of species S^ is denoted by Xi. Often we would need to permute the network (see 
Section |2.4[ ) for our analysis. The permutation a for which the final results are shown is given as a vector 
a = ( 17 ( 1 ),..., cr(d)), and under this permutation the i-th component of the state-vector corresponds to the 
copy-number of species So-(i). 


6.1 Family of Gene-Expression networks 

We now consider several variants of the simple Gene-Expression network given in [36] . In these networks there 
is a Gene (G), which is responsible for the transcription of massenger RNA or mRNA (M) molecules, that 
later translate into the Protein (P) of interest. Both mRNA and Protein molecules degrade spontaneously 
and as in m we allow for the Gene to switch between an active (Gon) and an inactive (Goff) state. For now 
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we assume that the transcription of mRNA is only possible in the active gene state Gon- 

The first model (see Network 0 in Figu re [T] A) we examine consists of 4 species (see the Species Chart in 
Table[^ and 6 reactions displayed in Figure [1^. For this network there is only one (independent) conservation 
relation 7 = ( 1 , 1 , 0 , 0 ) which is semi-positive, and so there are two bounded species Si and S 2 , whose copy- 
numbers xi and X 2 are “locked” in the relation a:i -I- 0:2 = c = 1, which simply says that the total number 
of Gene copies c is conserved by the dynamics. The other two species S3 and S4 are not involved in this 
conservation relation, and are hence free species. 


Species Chart 

51 = Goff 

52 = Gon 

53 =M 

54 = P 


Table 1: Species chart for Gene-expression networks in Section [ 6 T] Gon and Gos denote the gene in active 
and inactive states. M denotes the mRNA and P denotes the corresponding Protein. 


Using Algorithms and we can identify the decomposed state-space and the irreducible state-spaces 
for Network 0 (see Tablej^. One can verify that the state-space = {(1, 0), (0,1)} for bounded species is a 
closed communication class in C(0) and so the Birth-Cascade Tree (BCT) for this network will only consist 
of one path whose nodes give us the birth-cascades of free species (see Remark 4.5). It is easy to see that 


the length of this path is 3, the first birth-cascade is {S3} and the second one is {S4}. Observe that these 
two cascades correspond naturally to the two stages in the network, transcription and translation, thereby 
suggesting that our cascade construction approach be a useful tool for understanding the structure of large 
Systems Biology networks. This point is reinforced by examples in Sections | 6 . 2 [ and |6.3[ We can check that 
the only leaf node of BCT is death-exhaustive and recover the death-cascades for the network (see Remark 
4.5). In Network 0, both free species degrade spontaneously and so they are both singularly-degradable 
w.r.t. any set. Other conditions of Theorem |4.4| also hold and so we can use this result to conclude that the 
full state-space x Nq is the unique irreducible state-space for Network 0. This concludes our state-space 
analysis for Network 0. 

We now create Networks 1-4 by adding feedback from protein molecules (see |3Z]) to various reactions 
in Network 0 (see Figure [T]A). This feedback is added to reaction k by multiplying its original mass-action 
propensity function Xk{x) by a Hill-type factor of the form 


dih- 


U/4 


(6.62) 


where X 4 is the number of protein molecules, and 0fb,c and n are strictly positive parameters. Note that 
Xkix) = 0 if 0:4 = 0 which means that if we just multiply the propensity function for reaction k in Network 


0 by this factor (6.62), then the modified network will not satisfy Assumption 2.1 which is required for our 
analysis. However we can circumvent this problem by simply adding a molecule of species S4 to both sides 
of reaction k, changing it from A — H to A -|- S4 —> H -|- S4. This simple trick ensures that the modified 


network satisfies Assumption |2.1| and its dynamics remains unaffected as the reaction stoichiometry is the 
same. Incorporating the feedback mechanism this way, we list the reactions for Networks 1-4 in Figure 
and provide the results from the state-space analysis of these networks in Table Observe that in each of 
these cases a unique irreducible state-space is guaranteed by Theorem |4.4[ but this irreducible state-space 
varies among networks. Also note that in all the Networks 0-4, the reactions stoichiometries are the same 
and consequently their decomposed state-space is identical. 

We now consider another gene-expression model (see Network 5 in Figure[^ by allowing the transcription 
of mRNA in the inactive gene state Goff, and having feedback from protein molecules to the translation 
reaction as well as the gene-switching reactions. In this network also the decomposed state-space is same as 
before but there are exactly two irreducible state-spaces, {( 0 , 1 )} x No x { 0 } and {( 1 , 0 )} x No x { 0 } according 
to Theorem 4.4 (see Table [^. 

We end this section by remarking that if all the Hill-type feedback factors have a positive basal level, i.e. 
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Network 0 



0 


0 


Off 




)a 


0 


0 



0 


0 


B 


Network 0 

No. 

Reaction 

No. 

Reaction 

1 

Si —>■ S2 

4 

S3 —> S3 + S4 

2 

S2 —>■ Si 

5 

S3 ^0 

3 

S2 —)■ S2 + S3 

6 

S4 ^0 


Network 1 

No. 

Reaction 

No. 

Reaction 

1 

Si + S4 —> S2 + S4 

4 

S3 —S3 + S4 

2 

S2 —> Si 

5 

S3 ^0 

3 

S2 —> S2 + S3 

6 

S 4^0 


Network 2 

No. 

Reaction 

No. 

Reaction 

1 

Si —> S2 

4 

S3 —> S3 + S4 

2 

S2 + S4 —> Si + S4 

5 

S3 ^0 

3 

S2 —^ S2 + S3 

6 

S4 ^0 


Network 3 

No. 

Reaction 

No. 

Reaction 

1 

Si —> S2 

4 

S3 —> S3 + S4 

2 

S2 —> Si 

5 

S3 ^0 

3 

S2 + S4 —> S2 + S3 + S4 

6 

S4 —>0 


Network 4 

No. 

Reaction 

No. 

Reaction 

1 

Si —> S2 

4 

S3 + S4 —S3 + 2 S 4 

2 

S2 —> Si 

5 

S 3^0 

3 

S2 —> S2 + S3 

6 

S4 —>0 


Network 5 

No. 

Reaction 

No. 

Reaction 

1 

Si —>■ S2 

5 

S3 + S4 —> S3 + 2 S 4 

2 

S2 —^ Si 

6 

S 3-^0 

3 

S2 —> S2 + S3 

7 

S 4-^0 

4 

Si —> Si + S4 




Figure 1: The Gene-expression networks in Section 6.1 The networks are depicted in panel A and the 


corresponding reactions sets are tabulated in panel B. The feedback interactions are shown with dotted 
arrows and for each of the Networks 1-4, the corresponding feedback interaction is clearly marked in the 
middle figure of panel A. 


instead of (6.62) we have 


0i + Ofh 


X4 


c -I- 0:4 


for some 9i > 0, then we do not need to add species S 4 to both sides of feedback reactions to ensure that 
the modified networks satisfy Assumption |2.1| Indeed now Networks 1-4 will satisfy this assumption with 
the original set of reactions (same as Network 0) and hence they all will have the full state-space x Ng as 
their unique irreducible state-space. The same holds true for Network 5 even though it has an extra reaction. 
This illustrates the significance of Assumption |2.1| in determining the irreducible state-spaces. 
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Network 

Decomposed State-Space 

Irreducible state-spaces 

0 

£b X 

£t X Ng 

1 

X Ng 

{(1,0,0,0)} 

2 

X Ng 

4 " X 

3 

£b X Ng 

4" X 1(0,0)} 

4 

£b X Ng 

X No X {0} 

5 


1(0,1)} X No X {0} and 
1(1,0)} X No xjO} 


Table 2: State-space analysis for Gene-expression networks in Section 6.1 All results reported w.r.t the 
identity permutation a = (1, 2, 3,4) and £§ = {(1, 0), (0,1)}. For all the five networks all irreducible state- 
spaces were identified by Theorem |4.4[ 


6.2 Two Circadian Clock models 

Circadian clocks are molecular time-keeping devices that coordinate many physiological processes in living 
organisms [38]. These clocks generate oscillatory rhythms that are usually entrained to the periodic cues 
provided by the day-light cycles [Ml iO]. We consider two circadian clock networks in this section and 
prove using our analysis that there exists a unique irreducible state-space for both these models, thereby 
indicating that the stationary distributions for the associated stochastic models is unique. The existence of 
these stationary distribution can be checked using the techniques in [21] and hence the stochastic models for 
both these networks are ergodic (see Section [^. Therefore under constant inputs the individual stochastic 
trajectory of a single circadian clock may be oscillatory, but the mean trajectories, corresponding to the 
bulk or population-level behavior of several uncoupled and identical circadian clocks, cannot be oscillatory 
due to (1.17). This is consistent with both computational [Mllll] and experimental mi observations in the 


existing literature. In a recent paper it is argued that this loss of oscillatory activity at the population-level 
plays an important in ensuring that the entrainment to periodic cues is robust at the population-level |42j . 

The first circadian clock model we consider is from Vilar et al. [5] and it is depicted in Figure |A. It 
consists of gene-expression modules for an activator protein A and a repressor protein R which sequesters 
the activator protein A by forming an inactive complex AR. The activator protein A can enhance the 
transcription of both the mRNAs {Ma and Mr) by binding to the promoter regions of the activator gene 
Da and the repressor gene D^. When the promoter region is occupied the activator and the repressor 
genes are denoted by D'j^ and D'^ respectively. The overall network consists of 9 species (see the Species 
Chart in Table and 16 reactions (see Table |^. The results from our state-space analysis on this network 
are presented in Figure [2p and they show that the network has a unique irreducible state-space which 
coincides with its decomposed state-space. For this network the situation of Remark |4.5| applies and so 
we can arrange all the free species into birth and death cascades (Figure [^). These cascades correspond 
naturally to different stages in the network. 


Species Chart 

Si =Ma 

S6 = G$i 

Sa = A 

Sr = Gl 

S3 = Mr 

S8 = G^ 

S4 = A 

S, = Gl 

85= AR 


Table 3: Species chart for the first Circadian clock model (Vilar et al. [^). A and R denote the activator 
and repressor proteins. Ga (Ma) and Gr {Mr) denote the gene (mRNA) corresponding to these proteins. 
The superscript 6 or u indicates the bound or unbound from of the gene. AR denotes the dimeric complex 
between A and R. 
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No. 

Reaction 

No. 

Reaction 

1 

Se + S2 ^ St 

9 

S2 ^0 

2 

Sy^Se + Sa 

10 

Sg ^ Sg + S3 

3 

Ss + S2 Sg 

11 

Ss ^ Sg + S3 

4 

Sg ^ Ss -p S2 

12 

S3 ^0 

5 

St ^ S7 + S1 

13 

S3 ^ S3 + S4 

6 

Se ^ Se + Si 

14 

S4 ^0 

7 

Si ^0 

15 

S2 -f S4 S5 

8 

Si ^ Si -1- S2 

16 

Ss ^S 4 


Table 4: Reactions for the first Circadian clock model (Vilar et al. [5]). These reactions follow the Species 
Chart in Table The network is depicted in Figure 

Next we examine the detailed mammalian circadian clock given in [13] (see FigurejsjA.). It consists of three 
gene-expression modules corresponding to Per, Cry and Email genes. The proteins created by these genes 
participate in complex regulatory steps through various mechanisms such as formation of dimeric complexes, 
translocation in an out of nucleus and undergoing phosphorylation-dephosphorylation cycles. Overall the 
network consists of 16 species (see the Species Chart in Table[^ and 52 reactions (see Table[^. Interestingly 
there are no conservation relations for this network and so all the species are free and Theorem |4.4| proves 
that Nj® is the unique irreducible state-space (see Figure [^. The birth and death cascades, displayed in 
Figure]^, signify the various network stages as before. 


Species Chart 

Si = Mp 

Sg = PCm 

Sg = Me 

Sio = PCcp 

S3 = Mb 

Sii = PCnp 

S4 = Pc 

S12 = Be 

85 = Cc 

S13 = Bep 

Se = Pcp 

Si 4 = Ex 

S7 = Ccp 

Si 5 = Bxp 

Ss = PCc 

S16 = In 


Table 5: Species chart for the second Circadian clock model (Leloup and Goldbeter |13|). P, C and B denote 
the Per, Cry and Email proteins. The dimeric complex between two proteins X and Y is denoted by XY. 
The subscripts C, N and P stand for “Cytosol”, “Nucleus” and “Phosphorylated” respectively. Mx denotes 
the mRNA for protein X and Ix is an inactive trimer in the Nucleus. 


6.3 Bacterial Heat-Shock response model 

We now analyze the bacterial heat-shock response model developed in Kurata et al. jUj. It is known 
that when bacterial cells are exposed to thermal shocks, certain cellular structures are damaged, causing 
some proteins to denature or unfold. Accumulation of these denatured proteins within cells can disrupt 
their normal functioning and hence a regulatory mechanism has evolved to detect and repair unfolded 
proteins. The model in |44] describes this mechanism in bacterium E. Coli and it consists of 5 distinct 
gene-expression modules responsible for creating various heat-shock specific proteins {hsps), such as the 
heat-shock transcription factor factor, proteases like HslVu, FtsH etc. and a molecular chaperon DnaK, 
which assists in the refolding of denatured proteins. The sigma factor competes with the dominant 
sigma factor to bind to the RNA Polymerase (RNAP), and when it is able to bind, it can initiate the 
transcription of the heat-shock gene ph, which then produces the other heat-shock proteins. The sigma 
factor is itself produced by the gene pg when RNAP is bound to sigma factor (t^°. As explained in 
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0 


0 



Ma 


Mr 


0 


0 


Activator gene 

* Promoter regions 


Repressor gene 


B State-space analysis 


Conservation Relations 


No. 

Quantity 

R,elation 

1 

2 

Activator gene 
Repressor gene 

Xs + XT = Cl 
xs + xa = C2 


Permutation: 

fj - {6,7,8,9,1,3,2,4,5). 

Decomposed state-space for (ci,C 2 ) = {1,1)- 

£o = X Ng. 

with - {(0,1,0,1), (0,1,1,0), (1,0,0,1), (1,0,1,0)}. 

All irreducible state-spaces in 

£i; X Ng 


Birth Cascades 


Transcription of 


Transiation of 


Formation of 


0 



Death Cascades 


Conversion 


Ss 


1 

Si, S2, 
S3, S4 


Spontaneous 

degradation 


0 


Figure 2: Results for the Circadian clock model (Vilar et al. [S]) depicted in panel A. The results of 
state-space analysis are presented in panel B and the birth-death cascades are shown in panel C. The birth- 
cascades signify various stages in the network. Note that in this network there are four bounded species 
(tT(l),..., cr(4)) and five free species ((t( 5), ..., cr(9)). 


[33] the heat-shock proteins interact in complex ways to realize both feedback and feedforward control that 
confers several performance advantages to the heat-shock response mechanism. 

The network underlying the model in |44] is shown in Figure |4|A and it is fairly large, with 28 species 
(see the Species Chart in Table and 61 reactions (see Table The network contains 5 semi-positive 
conservation relations corresponding to various conserved quantities (see Figure |4^). Due to these relations 
12 species are bounded while the rest are free. Using our analysis we can the find the decomposed state-space 
for this network and verify the existence of a unique irreducible state-space (see Figure |4^). Note that in 
this example, the irreducible state-space is a strict subset of the full state-space and in particular, the last 
two components of each state in this irreducible state-space are zeros. These two components being zero 


27 















































No. 

Reaction 

No. 

Reaction 

1 

0-^Si 

27 

Sg-^Sg 

2 

Si ^0 

28 

S8-^0 

3 

Si ^0 

29 

Sg ^ Sii 

4 

0-^S2 

30 

Sll ^ Sg 

5 

S 2 ^0 

31 

Sg -f Si4 - > S 16 

6 

S 2 ^0 

32 

S 16 - > Sg -f Si4 

7 

0-^S3 

33 

Sg^0 

8 

S 3 ^0 

34 

SlO^0 

9 

S 3 ^0 

35 

Sio ^0 

10 

Si ^Si + S4 

36 

Sll ^0 

11 

S4^S6 

37 

Sll ^0 

12 

S6^S4 

38 

S 3 S 12 

13 

Sg ^ S 4 + S 5 

39 

S 12 ^ Si3 

14 

S 4 + S 5 ^ Sg 

40 

Sl3 ^ S 12 

15 

S 4 ^0 

41 

S 12 ^ Si4 

16 

S2^S2 + S5 

42 

Sl4 ^ S 12 

17 

St^Ss 

43 

S 12 ^0 

18 

Ss^Sy 

44 

Si3 ^0 

19 

85^0 

45 

Si3 ^0 

20 

Se ^0 

46 

Sl4 ^ Si5 

21 

Se ^0 

47 

Sl5 ^ Si4 

22 

S7^0 

48 

Si4^0 

23 

S7^0 

49 

Si5 ^0 

24 

Sg ^ Sio 

50 

Si5^0 

25 

Sio ^ Sg 

51 

S 16 ^0 

26 

Sg^Sg 

52 

S 16 ^0 


Table 6: Reactions for the second Circadian clock model (Leloup and Goldbeter [13]). These reactions follow 
the Species Chart in TableThe network is depicted in Figure]^ 


implies that the copy-number of unfolded protein molecules is zero. As these unfolded protein molecules 
can only leave the network by converting to a properly folded protein (via reaction 25 in Table , we can 
conclude that the heat-shock response network is such, that despite having noisy, stochastic dynamics, all 
the unfolded protein molecules will eventually get folded, irrespective of the initial count of these unfolded 
protein molecules. We base this assertion on the fact that starting from any initial state in the state-space, 
the Markovian reaction dynamics will eventually get trapped inside the unique irreducible state-space. What 
makes this assertion powerful is that it holds just due to the structure of the network, regardless of the values 
of the rate constants of the reactions. Also for this network we are in the situation of Remark 14.51 Hence 
we can arrange all the free species into birth and death cascades which correspond to various network stages 
(see Figure]^). 


6.4 A simple Toxin-Antitoxin network 

Many intracellular networks have reactions taking place on many different time-scales [29l |32|. It is well 
known that direct stochastic simulations of such networks, using Gillespie’s SSA [T^, is computationally 
infeasible because most of the resources are spent on simulating the “fast” reactions which are generally 
less important than the “slow” ones m- To remedy this problem, approximate simulation approaches 
have been developed that separate the time-scales by applying the quasi-stationary assumption (QSA) on 
the fast subcomponents of the network [29]. Under this assumption, the slow reactions are “switched- 
ofF’ and the dynamics of the slow species involved in these reactions is “frozen in time”, while the stable 
Markovian dynamics of the fast subnetwork relaxes to a unique stationary distribution (see (1.17)). Using 
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A 


B State-space analysis 



No conservation relations 


Permutation: 

(7= (1,2,3,4,5,12,6,7,13, 
8,14,9,10,15,11,16) 


Decomposed state-space: £q = Nj® 

All irreducible state-spaces in £S: K 


c 


Birth Cascades 


0 


■ Phosphorylation of 
Proteins 

■ Formation of PC complex 


Transcription of the 
three mRNAs 

Sl,S 2 

Translation of the 
three Proteins 

S4, Ss 

■ Nuclear translocation of B 
proteins 

Se, S 7 ,Si 3 I 


■_Sa.J 


I S12 


S 8 ,Sl 4 I 


• Nuclear translocation of 
PC complex 

• Phosphorylation of PC 
complex in cytosol 

• Phosphorylation of 
nuclear B proteins 


4 

S9, S10 
Sl 5 


Phosphorylation of PC complex 
in nucleus 

Formation of inactive dimer of 
PC complex and B proteins in 
nucleus 


5 

Sl1,Sl6 


Death Cascades 


1 

Si, S2, S3, S4 
Ss, Se, S7, Ss 
S9, S10, S11, S12 
Sl 3 , Sl 4 , Sl 5 , S16 


Spontaneous degradation 

- -0 


Figure 3: Results for the Circadian clock model (Leloup and Goldbeter [33]) depicted in panel A. The results 
of state-space analysis are presented in panel B and the birth-death cascades are shown in panel C. The 
birth-cascades signify various stages in the network. Note that in this network all the species are free as 
there are no conservation relations. 


this stationary distribution the propensities of the slow reactions can be estimated and the next slow reaction 
can be sampled. Applying QSA between every consecutive slow reaction events allows one to approximate the 
original dynamics without having the simulate the fast reactions, thereby reducing the overall computational 
effort drastically. 

For a successful application of QSA it is imperative to ascertain the stability or ergodicity of the fast 
subnetwork in real-time (i.e. during the simulation run), as the set of available fast reactions can depend 
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Species Chart 

Si = RNAP 

Si 5 = : DnaK 

S2 = 

S 16 = Punfoided 

S 3 = CT™ : RNAP 

S 17 = Punfoided • DnaK 

S 4 = 

S 18 = Protease 

S 5 = : RNAP 

Sig = : DnaK : Protease 

Se = D 

S 20 = HslVu 

S 7 = RNAP : D 

821 = : HslVu 

Sg = : RNAP : D 

S 22 = mRNA(DnaK) 

Sg = : RNAP 

S 23 = mRNA(Protease) 

Sio = pg 

S 24 = mRNA(HslVu) 

Sii = ph 

S 25 = mRNA(CT32) 

S 12 = cr™ : RNAP : pg 

S 26 = mRNA(FtsH) 

Si3 = 0-32 . RNAP : ph 

S 27 = FtsH 

Si 4 = DnaK 

S 28 = : DnaK : FtsH 


Table 7: Species chart for the Bacterial Heat-Shock response model (Kurata et al. [S]). Here the complex 
formed by binding two biomolecules A and B is denoted by H : B. The main players in this network are 
the heat-shock proteins DnaK, Protease, HslVu and FtsH, and the transcription factor which binds to 
the RNA Polymerase (RNAP) to initiate the production of the heat-shock proteins via the ph gene. This 
transcription factor is itself created by the gene pg when the transcription factor is bound to RNAP. 
Here D denotes the part of DNA that does not include these genes, and P is the protein of interest that 
needs to be converted from its denatured form Punfoided to its proper form Pfoided- The mRNA for protein 
X is denoted by mRNA(X). 


on the current state of the slow species. Moreover for efficiently estimating the slow reaction propensities it 
is useful to determine the exact time-varying support-sets of the stationary distributions. These tasks can 
be automatically accomplished by integrating our computational procedures for state-space analysis, with 
any QSA-based simulation algorithm like the slow-scale SSA or ssSSA |3D] or the Nested SSA [3T1I32]. We 
illustrate this next using a simple Toxin-Antitoxin network module which is found in many bacterial cells 
and is believed to lead to the formation of slow-growing persister cells that exhibit multi-drug tolerance. 
This example shows how restricted species arise naturally when we restrict our attention to a subnetwork 
within the larger network. Moreover this example also highlights that the flexibility in the classification of 
unbounded species as free or restricted (see Remark 3.21 can be really important for applications. In fact 
for this example, this classification will change randomly with time depending on the states visited by a 
stochastic trajectory. 

The simple Toxin-Antitoxin network we consider is depicted in Figure and it is based on the more 
detailed model given in [45]. It consists of a single DNA strand containing genes for both Toxin T and 
Antitoxin A protein molecules. Both these proteins are translated by a common biscistronic mRNA M and 
they both annihilate each other in the sense that they bind to form an inactive complex AT which does 
not participate in the dynamics. The Antitoxin molecules directly inhibit the transcription of mRNA and 
the Toxin molecules convert to a protein P that interferes with bacterial metabolism and harms the cells 
in various ways |46j . Following the Species Chart in Table we can describe our simple Toxin-Antitoxin 
system as a network with 8 reactions which are listed along with their propensity functions in Table |I0| The 
choice of rate constants in these propensity functions is arbitrary but reflective of the values found in the 
literature |3S]. From these rates it can be inferred that two reactions can be considered fast, namely, the 
translation of mRNA and the mutual annihilation of Toxin and Antitoxin proteins (see Figure [sjA). 

Suppose one is interested in simulating the stochastic model for this network in the time period [0, T] 
for T = 100. Performing exact stochastic simulations is cumbersome due to the presence of fast reactions, 
but an approximate QSA-based algorithm can be used for carrying out these simulations with far lesser 
computational effort. To apply QSA, we consider the dynamics of the component of the network consisting 
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No. 

Reaction 

No. 

Reaction 

1 

Si + S 2 S 3 

32 

S 18 ^0 

2 

S 3 Si + S 2 

33 

Sl9 ^ Si5 

3 

S 4 + Si S 5 

34 

S 12 - > S 25 -t- S 12 

4 

S 5 S 4 + Si 

35 

S 25 ^0 

5 

Si + Se Sj 

36 

S 25 —> S 4 -I- S 25 

6 

S 7 Si + Se 

37 

S4^0 

7 

S 4 -I- Si 4 - > Si 5 

38 

Sig -Sig -I- Si 4 

8 

S 15 - > S 4 -I- Si 4 

39 

S 21 - >■ S 20 

9 

S 16 -1- Si4 -S 17 

40 

S 28 - >■ Si4 -|- S 27 

10 

S 17 ->• S 16 -1- Si 4 

41 

S 13 -S 13 -I- S 24 

11 

S 3 + Se Sg 

42 

S24^0 

12 

Sg S 3 + Se 

43 

S 24 - >■ S 20 + S 24 

13 

S 5 + Se Sg 

44 

S 20 ^0 

14 

Sg S 5 + Sg 

45 

S 21 ^ S 4 

15 

S 3 + Sio ^ S 12 

46 

S 4 -|- S 20 —^ S 21 

16 

Sl2^S3 + Sio 

47 

S 21 —> S 4 -I- S 20 

17 

S 5 + Sii ^ Si3 

48 

S13 - > Si 3 -t- S26 

18 

Sl 3 ^S 5 + Sii 

49 

S26 ^0 

19 

Sl 5 -I- Sig - > Sig 

50 

S26 —>- S26 + S27 

20 

Sl 9 ->■ Si 5 -I- Sig 

51 

S 27^0 

21 

Si 3 ->■ Si 3 -I- S22 

52 

S28 ^ Si5 

22 

S22 — >■ 0 

53 

S15 + S27 ->■ S28 

23 

S22 ->■ Si 4 -|- S22 

54 

S28 - S15 -1- S27 

24 

Si 4 ^0 

55 

Ss^Si 

25 

Sl 7 - Si4 + Ffolded 

56 

Si 3 ^Si-fSii 

26 

Sl 5 ^ S 4 

57 

S9^S7 

27 

Sig ->■ S 4 - 1 - Sig 

58 

Sl 5 ^ Si 4 

28 

S28 - > S 4 -I- S27 

59 

Sig - > Si 4 -I- Sig 

29 

Sl 3 ^ Si 3 + S23 

60 

S28 - >■ Si4 + S27 

30 

31 

S23 ^0 

S23 —>■ Sig -I- S23 

61 

S21 -^ S20 


Table 8 : Reactions for the Bacterial Heat-Shock response model (Kurata et al. [S]). These reactions follow 
the Species Chart in Table The network is depicted in Figure |4]A.. Reaction 25 represents the refolding of 
denatured protein P when it is in complex with the chaperon protein DnaK. This reaction is the only way 
in which molecules of protein P exit the network. 


Species Chart 

to 

II II 

OJ 

II II 


Table 9: Species chart for the Toxin-Antitoxin model. T and A denote the Toxin and Antitoxin proteins 
respectively. They are both translated by the biscistronic mRNA M which is produced constitutively by a 
gene. The Toxin protein converts to another protein P which inhibits the metabolism of the cell. 


of two fast reactions 


Si ^ Si -f S 2 -f S 3 and S 2 + S 3 ^ 0, (6.63) 

assuming that all the other slow reactions are switched off and the state of the slow variables is fixed. 
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B State-space analysis 


Conservation Relations 


No. 

Qiimitity 

Relation 

1 

RNAP 

+ 2:3 -1- 15 -b i ;7 + Jg -H j’n + X12 + Xis = Ci 

2 


+ 2:12 = C2 

3 

D 

xa + xt + xg -b xq = C3 

4 

Pg 

^10 + ^'12 = <^4 

5 

ph 

J:ii + 3^13 = C5 


Permutation: 


7 = (1,2.3,5,6.7,8,9.10,11,12,13,25,4,22, 

23.24,26,14,18,20,27,15,21,19,28,16,17) 


Decomposed state-space for (ci, cg,C 4,C5) = (1.1,1,1,1): 
^q^ = CxNJ^x{(0.0)}, 


( 1 , 1 , 0 , 0 , 1 . 0 . 0 , 0 , 1 , 1 , 0 , 0 ) 
( 0 , 0 , 1 , 0 , 1 , 0 . 0 , 0 , 1 . 1 , 0 , 0 ) 
( 0 , 1 . 0 , 1 , 1 , 0 . 0 . 0 , 1 , 1 . 0 , 0 ) 
( 0 , 1 . 0 , 0 , 0 , 1 . 0 . 0 , 1 , 1 . 0 , 0 ) 
( 0 , 0 , 0 . 0 , 0 , 0 . 1 . 0 , 1 , 1 . 0 , 0 ) 
( 0 , 1 , 0 . 0 , 0 , 0 , 0 . 1 . 1 , 1 . 0 , 0 ) 
( 0 , 0 , 0 . 0 , 1 , 0 , 0 . 0 . 0 , 1 , 1 , 0 ) 
( 0 . 1 , 0 , 0 . 1 , 0 , 0 , 0 . 1 , 0 , 0 . 1 ) 


All irreducible state-spaces in 

CxNj^x{(0,0)} 


Birth Cascades 


0 



Death Cascades 



0 


Figure 4: Results for the Bacterial Heat-Shock response model (Kurata et al. [33]) depicted in panel A. 
The mRNAs are not labeled to avoid clutter. The results of state-space analysis are presented in panel 
B and the birth-death cascades are shown in panel C. The birth-cascades signify various stages in the 
network. Note that in this network there are twelve bounded species (cr(l),..., (t( 12)) and fourteen free 
species (ct(13), ..., a(26)). 


Following |29j . the slow variables in this network are Xi (mRNA copy-number) and y = X 2 — X 3 (difference 
between Toxin and Antotoxin copy-numbers), because these variables are unaffected by the fast reactions in 
(6.63). Fixing Xi and y we compute the decomposed state-space for this subnetwork using Algorithm]^ Due 
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No. 

1 

2 

3 

4 

5 

6 

7 

8 


Reaction 
0 ^ Si 

Si-^Si + S2 + S 
S 2 + S 3 0 



Propensity 

Ai(a:) = 

3 ^2(0;) = 6 »ia:i 

^ 3 ( 3 ;) = 6>2X2X3 
A4(a;) = 2a::i 
A 5 (a;) = 5a;2 
A 6 (x) = X 3 
Xjix) = 0.5a;2 
A8(a:) = 0.1a;4 


Table 10: Reactions for the Toxin-Antitoxin model according to the Species Chart in Table[^ The associated 
propensity functions A^-s are also provided. Here Xi denotes the copy-number of species S.^. We choose 
9i — 100 and 02 = 10, and hence reactions 2 and 3 are much faster in comparison to the other reactions. 
Reaction 3 is the annihilation reaction between Toxin and Antitoxin molecules, which actually represents 
formation of an inactive complex AT. In reactions 2 and 3, we choose parameters 0i = 100 and 9 = 10, and 
so the subnetwork formed by these two reactions can be considered fast in the context of the larger network. 


to the semi-positive conservation relation 71 = ( 1 , 0 , 0 ), the species Si is bounded and its finite state-space 
is simply the singleton {a:i}. The only other independent conservation relation is 72 = (0,1,—1), which 
is mixed-sign, and so there exists a restricted species. Algorithm will automatically classify one of the 
remaining species as free and the other as restricted, depending on the value of y, in such a way that the 


associated affine map (j) is compatible with the network (recall Assumption 4.1). If ?/ > 0, then Algorithm 
picks species S 3 as free, species S 2 as restricted, the affine map (j) as 4 >{x 2 ) = x^ + y and the permutation 
cr as cr = (1, 3, 2). However these choices violate network compatibility when y < 0 and so in this situation 
Algorithm picks species S 2 is free, species S 3 is restricted, the affine map 4> as (ji^xs) = X 2 — y and the 
permutation a as a = (1,2, 3). For convenience, let S^, (fy and ay denote the y-dependent choices of the free 
species, the affine map and the permutation a respectively. Then according to Algorithm [T] the decomposed 


state-space for the fast subnetwork permuted with ay is simply {a;i} x 4)^, where is the graph (see (1.15)) 
of map (f)y 


As described in Section 4.1 we can reduce the fast subnetwork (6.63) by eliminating the restricted species 


to obtain the following network 


Si 


Si -|- Sy and S^ 


(6.64) 


where the propensity function for the first reaction is same as before, but for the second reaction it changes 
to 


Ay(xi,z) = 92Z(j)y{z) = 92 z{z + 


(6.65) 


Here z is the copy-number of the free species Sy. Observe that the compatibility of ma p 4>y ensures that 
Xy{xi,z) = 0 if and only if z = 0, and hence the reduced network satisfies Assumption 


2.1 


This allows 

us to apply Algorithm to find all the irreducible state-spaces for this reduced network in the state-space 
{xi} X No and this algorithm outputs that there is a single irreducible state-space which is identical to the 
full state-space {xi} x Nq. Therefore the stationary distribution for this network must be exactly supported 
on {xi} X No, and in fact using the results in [35] (see also Section 6.5) we can compute it to be exactly: 


7ry(xi,z) = 


M1 


liX) (^!(^-S|y|)! 
0 


if xi = 0 and z = 0 

if Xi > 0 and z = 0 , 1 , 2 ,... 

otherwise, 


( 6 . 66 ) 


where a! denotes the factorial of a and L is the normalization constant given by 

00 


2=0 


z\{z+\y\)\ 
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Promoter/operator Antitoxin gene Toxin gene 

region 


B 



Exact SSA 


ssSSA - QSA 




Values 


Values 


Figure 5: Stochastic simulations for the Toxin-Antitoxin model depicted in panel A. The fast reactions 
are indicated with a thicker arrow. In panel B the estimated mean-dynamics E(A 4 (t)) for copy-numbers of 
protein P is plotted, and in panel C, the estimated probability distributions (histograms) for these copy- 
numbers at the final time-point T = 100 are shown. All these estimations were performed with two simulation 
schemes - the Exact SSA m and the approximate ssSSA [30] that uses the quasi-stationary assumption 
(QSA). Observe that ssSSA is quite accurate and for this example, the simulations using ssSSA were about 
6 times faster than those using the Exact SSA. 


It is easy to check that L < oo and hence is a valid stationary distribution over {xi} x Nq. The uniqueness 
of this stationary distribution is guaranteed because {xi} x Nq is the only irreducible state-space for the 
reduced network. Due to Proposition 4.2 we can conclude that {a;i} x is the only irreducible state-space 


for the full subnetwork (6.631 under permutation ay and its unique stationary distribution tTj, on {cci} x 
is just 


TTy{xi,z,(t>y{z)) = Try{xi,z). 

This also shows that the subnetwork (6.63|) is ergodic which is necessary for the application of QSA. For the 
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slow reactions k G {1,4, 5, 6, 7,8} the propensities needed for the application of QSA can be estimated as 

Xk{xi,y,X4) = ^ Xkixi, z, (j)y{z),X 2 )^yixi, z). 

zGNo 


Using these propensities we can simulate the Toxin-Antitoxin network with ssSSA [3D] to estimate the 
probability distribution of copy-numbers of protein P (i.e. X^lt)) as well as the mean dynamics of these copy- 
numbers. The results are reported in Figure]^ where results from the exact SSA simulations are also shown 
for comparison. One can see that ssSSA simulations are quite accurate thanks to the correct identification 
of the ergodic subnetwork and its stationary distribution. Moreover these ssSSA simulations are about 6 
times faster than simulations with Exact SSA. This simple example nicely illustrates how our algorithms for 
state-space analysis can be integrated with a simulation scheme like ssSSA, to aid the application of QSA. 
Note that in more complicated examples, the exact form of the stationary distribution may not be available, 
but our analysis can verify its uniqueness and provide a description of its exact support. In such cases this 
knowledge can be combined with other simulation-based schemes to sample from the stationary distributions 
and estimate the slow propensities for applying QSA (see mm)- 


6.5 Networks with product-form stationary distributions 


In this paper we present a method for finding irreducible state-spaces for networks where infinitely many 
states are accessible. On each of these irreducible state-spaces, the uniqueness of the stationary distribution 
is guaranteed but its existence needs to be checked by other means (like the analysis in | 21 |)- This is 
complementary to certain other results in the literature which assume the knowledge of irreducible state- 
spaces and demonstrate the existence of product-form stationary distributions for a large class of networks 
[26j . Exploiting this complementarity, we now explore how the combination of our results with the results 
on product-form stationary distributions, can provide us with a complete characterization of the simplex of 
stationary distributions (see ( jl.lG l) for several networks. 

Consider a reaction network Af = (y,0,A) with d species and K reactions of the f orm (|1.I[ ) (see Section 
[^. For now we assume that each propensity function has the mass-action form ( 2.2I[ ) with some rate 
constant 9k > 0. In the deterministic setting, the state of the network at time t is a vector of species 
concentrations x{t) = {xi{t ),..., Xd{t)) G which evolves according to the following ODE 


^ ='^0k(Y[xi''^\ (pk-i^k), (6.67) 

k=l \i=l / 

where i^k = {vik, • ■ •, Vdk) and pk = (pi/c, • ■ •, Pdk) are vectors in Nq denoting the fc-th column of matrices V 
and O respectively. Observe that the fc-th reaction in the network can simply be represented as Vk —^ Pk- 
With this association, Vk and pk are called network complexes. Let C = {vk, Pk : fc = I,..., AT} be the set of 
all network complexes and suppose there exists a strictly positive vector r = (ri,..., r^) G such that for 
each z G C we have 


K 





K 

^ Ok 

k=l,pk=z 



This relation simply says that when the vector of species concentrations is r, the rate at which a complex 
z is consumed (l.h.s.) is same as the rate at which this complex is produced (r.h.s.). It can be shown 
that r is a fixed point for the deterministic state-dynamics (6.67), i.e. r.h.s of ( 6.67| ) is 0 when x = r. 
Hence the network is called complex-balanced and r is called a complex-balanced fixed point. This property 
has several implications regarding the topology of the network as well as the existence, uniqueness and 
stability of its fixed points (see [4711181149]). The existence of a complex-balanced equilibrium can be verified 
computationally, but it can also be checked for many networks using the famous deficiency zero theorem (see 
[ID]) from Chemical Reaction Network Theory. 

In [26] . Anderson et al. prove that if a network Af with mass-action propensities is complex-balanced 
in the sense described above, then there exists a product-form stationary distribution tt for its stochastic 
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model, on each irreducible state-space 8 within the state-space 8 q C Nq of the network. This stationary 
distribution is given by 


1 ^ik 

7t(x) = TT for all X G S, (6.68) 

M xA 

1—1 

where a\ denotes the factorial of a, r is the complex-balanced fixed point and M is the normalizing constant 
given by 


_ _ 

""Sn-rr- 


(6.69) 


x^Si —1 


To use this elegant result one needs to know the irreducible state-spaces, which is precisely what is ac¬ 
complished in this paper. In fact, the examples considered before suggest that for many networks we can 


provably find all the irreducible state-spaces £i,... ,£q for the network using our main result Theorem 4.4 


If this is true, then we can easily compute the corresponding product-from (6.68) stationary distributions 


supported on these classes and obtain the exact simplex S (see (1.16)) of stationary distributions for the 
network. 

Note that if the whole nonnegative integer orthant f = Nq is an irreducible state-space, then this is the 
only irreducible state-space, and the simplex S consists of just one stationary distribution tt given by (6.68) 


with £■ = Nq and M = exp(^.^^ r^). In other words, tt is just a product of Poisson distributions and in such 
a scenario, the species copy-numbers are independent at stationarity, with the copy-number distribution of 
species i being Poisson with mean r^. As the species are constantly interacting through reactions, having this 
independence is quite remarkable, and it has been argued that this independence could play an important 
role in metabolic pathways [50] . 

We now demonstrate how for certain networks, our results can help in accurately computing the stationary 
networks by replacing a possibly infinite sum in (6.69) with a hnite sum, thereby avoiding any truncation 


errors associated with the problem of estimating inhnite sums. Assume that network M is complex-balanced 
and it does not have any restricted specie^ Also suppose that using Theorem 4A we obtain the irreducible 


state-space 8 = 8b x Nq x {0}, under some permutation a which we we can assume to be identity (i.e. 
cr(*) = i for each i), without any loss of generality. Here 8 b is a finite set in Nq , db is the number of hounded 
species and di is some number less than the number of free species df = (d — db). The decomposed form 
of 8 shows that any element x G 8 can be expressed as a; = {xb,Xf,0), where Xb = {xi,... ,Xdi) G 8b and 


Xf = {xd^+i,... ,Xd^+di) G Ngb Therefore we can write (6.69) as 


d 


M = 




^ n 

{xb,Xf ,0 )gS i—1 


' i 

xA 



I 


E 




n 

i—db-\-l 


xA 


= exp 



which is a hnite sum, that can be easily computed without incurring any truncation errors. In deriving the 
last relation, we have used that J2^=o = exp(a) for any a e K. 

Until now we were assuming that propensity functions of the complex-balanced network Af satisfy mass- 
action kinetics. Theorem 6.1 in |26| relaxes this assumption of mass-action kinetics and proves the existence 
of product-form stationary distributions for more general kinetics, where each species i has a “rate of asso¬ 
ciation” function : Nq —>■ K+. In this setting, the mass-action formula (2.21) for the propensity function 
changes to 


d Uik — l 

Afe(a;)=6ifeJ| Ki{x, - j) 

1^1 j^O 

^This does not cause any loss of generality as we can always remove the restricted species and obtain a reduced network as 
described in Section |4.1| 
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and the product-form stationary distribution becomes 


=M n 




for all X G £, 


with the normalizing constant M chosen to ensure that ~ Observe that this result can be 

used to compute the stationary distribution (6.661 for network ( |6.64[ ) in Section 6.4 Recently this result 
has been extended even further to consider more general species-specific association rate functions (see [5T]'I. 
Such results along with our computational framework for identifying irreducible state-spaces, provide a way 
for characterizing the stationary distributions for complex-balanced networks. 


7 Proof of Theorem 4.4 


Let Lmin be the set of al minimal leaf nodes of the Birth-Cascade Tree (BCT) constructed in Section 4.4 
We pick any {C,A) G Lmin and consider the dynamics of network Af°' under the permutation a — 


mentioned in the statement of Theorem 4.4 Consider the directed path (4.60) in BCT, starting with the node 


(Co, 0) and terminating at the leaf node (C, A). The next lemma is a simple consequence of the construction 
of BCT. 


Lemma 7.1 

and 


Pick any Zi G Cg, Z 2 G C and rg G Then there exists a vector x G Nq'^' such that x > rg 


|A| 


AT' 


(2:1,0) -^ ( Z2 , X , 0 ). 


Moreover this relation also holds for any zi G C. 




by 


Let Ci-s and G^-s be as in the path (4.60) 


Proof. Throughout this proof we denote the relation 
consisting of {I + 1) nodes. Let di = |Gi| for each i and since G^-s in an increasing family of sets we must 
have 


0 = do < (^1 < (^2 < ■ • • < di = 1^1- 

By permuting the network if necessary, we can assume that Gi = {1,..., d^} for each i. 

Clearly the case ^ = 0 is trivial because if this case Gq = G and A = %. We now consider the case I = 1 
and A = Gi- Due to reactions in the set /Cj.(0), the dynamics of bounded species can reach any state in 
Go from any other state in Gg. The construction of BCT ensures that for each free species in 'D'^{A), 
there exists a state y G Gg and a reaction /Cj.(?/, 0) C /Cr(0), which produces this free species. By repeated 
brings of this reaction we can push the molecular-count of species beyond any positive integer. Note that 
reactions in /Cr(0) do not consume any free species. Hence we can perform this procedure independently for 
all the free species in Vj^A), and prove that for any r G Ng^ there exists a vector x' G Ng^ along with some 
state y G Gg such that x’ >r and 


(zi,0) — {y,x',Q). 

However since G G '^((i,Ai)(C^o)j after finitely many reactions in K.r(A) that only consume the free species in 
'Dj{A), the state of the bounded species can go from y to Z 2 G G, thereby ensuring that 

iy,x',0) —)■ {z 2 ,x, 0 ) 


for some x G Ng^ with x < x'. By choosing r with large enough entries we can ensure that x > rg. The 
assertion of this lemma (i.e. (zi, 0) — {z 2 , x, 0)) then follows for I = 1 from the transitivity of relation —>■ 
(see Section 2.1). 

For a general I > 1, one can repeat the above arguments at each stage i = 1, 2,..., (Z — 1) to prove the 
lemma’s assertion. At each stage we rely on the fact that G^ is a closed communication class under reactions 
in K.r{Gi) and these reactions do not consume any free species outside the set 'Dj{Gi). 
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Now suppose Zi S C. After finitely many reactions in the set /Cr(0), the state of bounded species will 
reach a state z'^ in some closed communication class C'q S C(0). As Zi S C and /Cj.(0) C JCr{A) we must have 
Cq C C. There exists a BCT-path of the form 

(C',0) = (C',G') ^ {C[,G[) ^ ^ {C'r^.G'^) = {G',A% 

culminating in the leaf node {G',A') S L. Note that as Gq C G we have Ar(GQ,0) C ICr{G,A) and hence 
G'l C G and G'l C A. Repeating this argument (m — 1) times we can conclude that G!^ = G' C G and 
= A' C A. However as (G, A) is a minimal leaf node we must have G' = G and A' = A. The result now 
follows from the assertion already proven above. □ 

For the minimal leaf node (G, A) S Lmin, let N'^{G,A) be the network formed by restricting the free 
species to the set 'Dj{A), the reactions to the set ICr{G,A) and the bounded species state-space to the set 

G. Let Ld(G, A) denote the set of all leaf nodes for the Death-Cascade Tree (DCT) for network A/"”^(G, A). 
The following lemma pertains to the situation when the leaf node (G, A) is death-exhaustive. 

Lemma 7.2 Suppose that the minimal leaf node (G, A) € Lmin is death-exhaustive, i.e. there exists a node 
(G', A') e Ld(G, A) such that A = A'. Then G' = G and for any zi,Z2 G G and tq S there exists a 
vector X G N}/'' such that x > ro and 


{zx,x,G) 



( 22 , 0 ). 


Proof. Let A/l^y(G, A) be the inverse of network M'^{G,A). Note that since (G, A) is a leaf node for 
BCT, relation ( |4.59 ) holds and hence the reactions in ICr{G,A) do not produce any free species outside the 
set I?j(A). This implies that in the inverse network A/'j^.^,(G, A) no reaction in JCr{G,A) can consume any 
free species outside the set T>J{A). Hence all the reactions in ICr{G,A) can fire in this inverse network as it 
satisfies Assumption |2.1[ Since communication is a symmetric relation and G' C G is a closed communication 
class, A' = A implies that G' = G. By definition, the DCT for network A/'‘^(G, A) is the BCT of network 


N-f^,j{G,A). Consider the directed path (4.601 in this BCT, starting with th e no de (Go,0) and terminating 
at the leaf node (G, A). Pick any zi,Z 2 G G and rg S Applying Lemma 
exists a vector x G such that a: > rg and 


7.1 


we can conclude that there 


(22,0; 




{zi,x,0), 


which also implies 




(zi,a:,0) —( 22 , 0 ), 


due to relation (2.231 and the fact that network N'^{G,A) is simply a restriction of the network 


completes the proof of this lemma. 


This 

□ 


Lemma 7.3 Let (G, A) G L„iin be a minimal leaf node such that all the free species inVj^A) are singularly- 

degradable w.r.t. A. Then there exists a rg G such that for any z G G and xi,X 2 G satisfying 
xi > X 2 > rg we have 


{z,xi,0) ^ {Z,X 2 , 0 ). 

Proof. Throughout this proof we denote the relation — > by — >. Without loss of generality we can assume 
that A = {1, 2,..., |A|}. The proof of this lemma is inspired by the proof of Theorem 3.4 in [^. Fixing a 
z G G, we first show that for each i G A there exists a ri G Nj/'' such that 

{z,r„0) — > {z,ri-ei,0), (7.70) 
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where is the i-th standard basis vector in 

Pick any i G A and let ICp{A) be the set of reactions given by (4.61). Since each species in 'Dj{A) is 

singularly-degradable, there exists a sequence of reactions fci,..., S ICp{A) such that 


(-ei,0) = 

1=1 


(7.71) 


where and are as in Section 4.4 Note that since for each reaction kj we have supp(p|d C A^ we must 
also have supp(?'^.) C A^ or otherwise the last {df — |A|) components in the r.h.s. of ( |7.71 1 cannot be 0. 
Therefore each of these reactions can only consume the free species in T>J{A). For each m = 1,... ,n let 


m— 1 


= E (pI - ^ 


1=1 


By choosing a G with large enough entries we can ensure that (ri,0) + for each m. Since 

none of these reactions involve the bounded species and Assumption |2.1| is satisfied, such a choice of 
also ensures that each reaction km has a positive probability of firing when the state of the free species is 


(ri, 0) + ym, thereby implying that (7.70) is satisfied. 


We find such a for each i G A and compute the maximum rg := maxig^jri} of these vectors in the 

\A\ 

we can conclude that for any a; G Nq satisfying a; > rg we have 


componentwise sense. Using Proposition 


2.2 


iz,x, 0 ) —1 {z,x- ei, 0 ) 


(7.72) 


for each i G A. Now select any Xi,X 2 G satisfying Xi > X 2 > rg and let a = {xi — X 2 ) G We can 

express a as the sum 

l-4| 


Q — / Ol-iCi. 


Exploiting the transitivity of relation 

{z, X 2 + a, 0) — 1 {z, X 2 + a — aiCi, 0) — 1 (z, X 2 +a — oiCi — 0262 ,0) 

But a ;2 + a = a;i and {x 2 + a — X)l=i = 2^2 and hence the proof of this lemma is complete. 


and using (7.72), times for each i we obtain the accessibility chain 

{z,X2 + a-Y}i^^a^e^,0). 

□ 


Next we present a simple modification of the above lemma in the case where all the free species are 
singularly-degradable w.r.t. A. 


Lemma 7.4 Let {C,A) G Lmin he a minimal leaf node such that all the free species inDf are singularly- 

there 

satisfying x > rg we have 


degradable w.r.t. A. Then there exists a rg G Ng^' such that for any z G C, y G Ng^ and any x G Ng^' 


Al' 


{z,x,y) —1 {z,x,Q). 


7.3 


Proof. The proof is similar to the proof of Lemma 
assume that A = {1, 2, ..., |A|}. Fix a z G C. We first show that for each i 
a Ti G such that 


JJ'o- 

As before we denote the relation —1 by —> and 
G {|A| + 1,... ,df} there exists 


(z,ri,e,) —> {z,ri,0), (7.73) 

where is the 1 -th standard basis vector in 

Fix any i G {|A| -|- 1,..., and since all the free species AiT)f are singularly-degradable w.r.t. A, there 
exists a sequence of reactions ki,... ,kn G K.p{A) such that 

n 

(0,-ei) = -^fe,)- 

t=i 
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Note that for each reaction kj we have supp(p^^) C A. Hence there must exist a rig G {1,... ,n} such that 
supp(r'^.) C A for all j ^ uq and for reaction kng the following must be satisfied: 


For each m = 1,..., n let 


supp(i/fc„^) C A U {i} and = 1. 


m— 1 

Vm = y^(Pfc, 


J=1 

By choosing a G with large enough entries we can ensure that {ri,ei) + ym > for each m. 
Since none of these reactions involve the bounded species and Assumption |2.1| is satisfied, such a choice of 
ri also ensures that each reaction km has a positive probability of firing when the state of the free species is 
(?’i, Si) + Um, thereby implying that (7.73) is satisfied. 

We hnd such a ri for each i G {|A| + l,...,df} and compute the maximum rg := maxjri} in the 

\A\ 

we can conclude that for any a; G Ng satisfying x > rg and 


2.2 


componentwise sense. Using Proposition 
any y = {yi, .. .,ydf-\A\) G we have 


(z,x,y) —)■ {z,x,y-ei) 


(7.74) 


as long as j/i > 1. Exploiting the transitivity of relation —>■ and using ( 7.74[ ), yi times for each i we obtain 
the accessibility chain (z, x, y) —(z, x,y - yiei) —(z, x,y - yiCi - 1 / 262 ) —• • • —( 2 , x,y - yiCi - 

1/262- ydf-\A\edf-\A\)- But y-1/161-1/262- ydf-\A\edf-\A\ = 0 and hence the proof of this lemma 

is complete. □ 


We are now ready to prove Theorem 4.4 We shall use the fact that irreducible state-spaces for network 
A/’'^ must be necessarily disjo int (see Section 2.1). 

A", 


Proof. [Proof of Theorem 
leaf node (C, A) G L 


4.4 


Throughout the proof we denote the relation —> by —>. We pick any minimal 


and we suppose that this node is death-exhaustive and all the free species in 'D'j{A) 
are singularly-degradable w.r.t. A. We now show that C x x {0} is an irr educ ible state-space for network 
A/'®’ permuted according to the permutation cr = erg (A) dehned in Section 4.4 To prove this assertion it 

Ix4| ^^ 

suffices to show that for any zi, Z 2 G C and x G Ng we have 

(zi,0) -)■ (Z2,X,0) —^ ( 21 , 0 ). 


(7.75) 


This is because if (7.75) holds then for any (zi,xi, 0), (z 2 ,X 2 , 0) G C x Nq"^' x {0} we have (zi,xi,0) 


(z 2 ,X 2 , 0 ) due to the following chain of accessibility relations (zi,xi,0) 
transitivity of —>. 


(zi,0) —)■ (z 2 ,X 2 , 0 ) and the 


Lemma 


7.2 


We now prove (7.75) for a fixed zi,Z 2 G C and x G n}/^'. Since the leaf node (U, A) is death-exhaustive 


implies that for any rg G Ng"^' there exists a vector x' G n[(^' satisfying x' > rg and 

(Z2,x',0) -)■ (Z2,0). 


(7.76) 


We assume that rg is as in Lemma 7.3 Due to Proposition |2.2[ relation (7.76) implies (z 2 ,x -I- x',0) —^ 
(z2,x,0). Using Lemma [ 7 .l| we can find a vector y G Nj)^' satisfying y > (x -I- x') and (zi,0) —(z2,y,0). 
Since (x -I- x') > rg, Lemma |7l^ implies that (z 2 , y, 0) —(z 2 , x -I- x', 0). We now have an accessibility chain 
(zi,, 0) —> (z 2 , y, 0) — > (z 2 , X -I- x', 0) —> (,Z 2 ,x, 0), and since —> is transitive we have the first relation in 


(7.75). 


We now show the converse (i.e. (z2,x,0) 
xi,X 2 G Ng"^' such that Xi > X 2 > 0, (z2,X2,0) 


(zi,0)). Using Lemma [7A] and \l72\ we can find vectors 
(zi,0) and (z2,0) —(z2,Xi,0). The last relation also 


implies (z 2 , x, 0) —> (z 2 , Xi-|-x, 0) due to Proposition |2.2[ Since (xi-|-x) > X 2 > rg we have (z 2 , xi-|-x, 0) — 
(z 2 , X 2 , 0) due to Lemma [7(3| This gives us the following chain of accessibility relations (z 2 , x, 0) —(z 2 , Xi 
X, 0) —(z2,X2,0) —(zi,0) which shows the second relation of (7.75) and proves that C x 
an irreducible state-space for network A/"'^. 


0 X {0} is 
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Assuming that all the free species in the set T>f are singularly-degradable w.r.t. A, we now prove 
that this is the network’s only irreducible state-space that can contain elements in the set C x Ng^ under 
the permutation a = ns (A). We prove this claim by contradiction by showing that ii £ C £§ x Ng''^ is 
any irreducible state-space for network A/"'’’ satisfying £ H {C x Ng^) ^ 0 then it cannot be disjoint from 


C X 

by {z,x) with relation 


g X {0}. Let {z,x) be any state in £, and as this set is irreducible, any state that can be accessed 
, must also be in £. W e can write x sNq-^ as x = (xi,X 2 ) w here xi S N{j^' and 




X 2 € Ng^ Let xg G be as in Lemma 7.4 Using Lemma 7.1 and Proposition|2.2|if necessary, we can 
assume that (z,xi,X 2 ) —> {z,x'i,X 2 ) for some x'j^ > rg. Lemma 7.4 then implies that {z,x'i,X 2 ) —> {z,x'i,0), 
which also means that ( 0 ,xi,O) G £. But {z,x'i,0) G C x x {0} and hence £ cannot be disjoint from 


CxK 




4.4 


Ng X {0}, giving a contradiction. This completes the proof of part (A) of Theorem 
We now prove part (B). Let £ C £^ x be a nonempty irreducible state-space for network . Let 
(z,x) be any state in £. After finitely many reactions in the set ICrijl)), the state of hounded species will 
reach a state z' in some closed communication class Cg G C(0). Hence there exists a x' G Ng^ such that 
{z,x) — {z',x') G £. Moreover there exists a BCT-path of the form 

(C',0) = (C'.G'g) ^ ^ ^ = {G\A'), 

culminating in the leaf node (G', A') G L. Since L = Lmin the leaf node (G', A') is also minimal. We view the 


dynami cs of network Af'^ un der t he permutation a = erg (A') defined in Section 
and Proposition 


Lemma 


7.1 


2.2 


4.4 


Pick any zi G G'. Due to 


we know that for any xg G N{j^ ' , there exists a xi G ' and X2 G Ng^ ' 
such that (z',x') — (2;i,xi,X2) G £. However (2;i,xi,X2) is an element in G' x N{j^ ' x {0} and hence, as 
argued previously, we arrive at a contradiction unless £ = G' x ' x {0}. This completes the proof of part 
(B) of Theorem 4.4 □ 


8 Conclusion 


The aim of this paper is to provide a new tool for analyzing continuous-time Markov chain (CTMC) models 
of biomolecular reaction networks. Specifically we are interested in situations where the state-space for 
the CTMC needs to be countably infinite due to the lack of a global conservation relationship among all 
the species. Such situations arise frequently in Systems Biology as stochastic models generally describe 
the activity of a small subnetwork embedded within a larger network. We develop a simple procedure to 
systematically explore the space of conservation relations among species and represent the state-space of the 
CTMC in a special decomposed-form based on the copy-number ranges of all the species. This form can help 
in assessing the reachability relations and the communication structures within the infinite state-space of 
the underlying CTMC. In this context, the main goal of this paper is to construct a computational method 
for hnding all the closed communication classes for the CTMC. Such classes are natural attracting sets for 


the dynamics and they can also be viewed as irreducible state-spaces for the CTMC (see Section 2.11. Under 
the existence of a suitable Foster-Lyapunov function (see [24l[23] and Section]^, each irreducible state-space 
supports a unique stationary distribution and these distributions form the vertices of the full simplex (see 
(1.16)) of stationary distributions of the CTMC. 

As we discuss in this paper, finding irreducible state-spaces for networks where infinitely many states 
are accessible, is a challenging but a very important problem for several reasons. These reasons include 
understanding the stability and ergodic properties of networks (see Section]^, analyzing network topologies 
using methods from Transition Path Theory |28j and in obtaining the exact stationary distribution for a large 


class of networks (see Section 6.5). Furthermore our computational procedure for finding irreducible states- 


spaces can assist in an automated discovery of fast ergodic subnetworks for the application of quasi-stationary 
assumption (QSA) in the simulation of multiscale reaction networks (see Section 6.4 and |30M31l[52] l. Using 


our procedure we demonstrate in Section that generally networks from Systems Biology admit a single 
irreducible state-space which corresponds to the situation where the CTMC describing the reaction dynamics 
is ergodic and has a unique stationary distribution. We discuss how this information along with the structure 
of irreducible state-space can sometimes provide valuable biological insights into the design and functionality 


of the underlying network (see Sections 6.2 and 6.3). 
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Our approach works by classifying each species as free, bounded and restricted, based on their admissible 
copy-number ranges. The bounded species have a finite copy-number range and their dynamics evolves in a 
finite set On the other hand the copy-number range of free species is the infinite set of all non-negative 
integers Nq, and hence their dynamics evolves in an orthant Nq^. The restricted species imitate the dynamics 
of free species according to some affine function, and they can be removed from the network for the purpose of 
finding irreducible state-spaces (see Section 4.1). We develop a computational procedure that can provably 
locate all the irreducible state-spaces for the underlying CTMC within the infinite state-space £}, x Ng^. 
This is accomplished by suitably combining the matrix methods used in the finite state-space case where 
only bounded species are present (see Section 4.2), along with the construction of birth-death cascades for 
the free species (see Section 4.4). We demonstrate the versatility of our method through many examples 
from Systems Biology in Section From these examples one can also observe that the birth-death cascades 
correspond naturally to various important stages in the network and hence this cascade construction process 
facilitates a better understanding of the network design. 

Finally we would like to mention that since our computational procedure only involves basic linear- 
algebraic tasks (such as matrix computations, solving linear equations and Linear Programs etc.), it can be 
efficiently applied in very high dimensions. Hence our method can easily handle large reaction networks with 
several species and reactions. However computational issues may arise if the size Nb of the finite state-space 
Sb for bounded species becomes too large, as our method requires several computations with matrices of size 
Nb X Nb- We hope to resolve these issues in a future work. 
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